

Journal of Atmospheric and Terrestrial Physics, Vol. 56, No. 7, pp. 865-870, 1994 Elsevier Science Ltd

Printed in Great Britain 0021-9169/94 $6.00 + 0.00

World Data Centres - past, present and future

STANLEY RUTTENBERG¹* and HENRY RISHBETH²,³†

1. National Center for Atmospheric Research, P.O. Box 3000, Boulder CO 80307, U.S.A.;

2. Rutherford Appleton Laboratory, Chilton, Didcot, Oxfordshire OX11 0QX, U.K.;

3. Department of Physics, University of Southampton, Southampton SO9 5NH, U.K.

Chairman* and Secretary†, ICSU Panel on World Data Centres

(Received in final form 26 November 1993; accepted 26 November 1993)

Abstract-This review of the ICSU World Data Centre system is offered as a tribute to Sir Granville Beynon and his colleagues, whose vision led to the setting up of World Data Centres for the International Geophysical Year of 1957-1958. The article reviews the development of the WDC system, its place in the current scientific scene, and some of the issues that it faces.

1. INTRODUCTION

Scientific data gathering has a long history. Information about solar and auroral activity in past millennia was chronicled by the Chinese and other peoples. In the Western world, systematic geophysical measurements extend back for centuries, but mechanisms for data distribution and exchange are more recent. In the 18th and 19th centuries, data were exchanged from the early geomagnetic and seismic observatories largely through publications of annual station books. Oceanographic and geological expeditions were recorded in expedition reports. Although there were no convenient ways to copy the original records, our knowledge of the geomagnetic field, plate tectonics and ocean currents owes much to these records.

Benjamin Franklin in the 1770s and Matthew Maury a century later collected oceanographic data from the Atlantic directly from ship captains, which led to the first synoptic oceanographic study - of the Gulf Stream. From about 1860 the telegraph made it possible to use data from meteorological stations for weather forecasting, incidentally validating Franklin’s hypothesis that weather moves from West to East. Maury advocated international collaboration in data gathering, which helped to stimulate the International Polar Years of 1882-1883 and 1932-1933, and eventually led to the International Geophysical Year of 1957-1958.

To serve the IGY, the World Data Centre system was established through the work of Granville Beynon and his many colleagues. He maintained his connection with the WDCs in varied ways, latterly through many years of his leadership of the international Panel on World Data Centres. This brief account of the WDC system therefore fits well into this journal issue that honours Granville Beynon’s life and work.

2. THE IGY AND ITS AFTERMATH

Planning of the IGY was coordinated by CSAGI, the Special Committee for the IGY set up by the International Council of Scientific Unions (ICSU). At the 1955 Brussels meeting of CSAGI, Granville Beynon and his colleagues reasoned that traditional records were not enough for modern research. They decided that the IGY data collections, both from routine monitoring instruments and from special experiments, should be preserved in permanent data centres for future use. The functioning of the WDC system is well documented in the Annals of the IGY. Explicit data management plans were developed for each IGY discipline, stating in detail the types of data, and their time schedules and formats, that were to be submitted to the WDCs for distribution and exchange, and archived for future research. These specifications were published in a series of Guides to Data Exchange which, updated over the years, remain as the standards for data exchange. The IGY planners were remarkably prescient: the 1955 recommendation mentioned that data centres should be prepared to handle data in machine-readable form, which at that time meant punched cards and punched tape.

National IGY Committees were invited to establish and operate World Data Centres in one or more disciplines, at national expense, abiding by rules promulgated by CSAGI. Accordingly, the USA and USSR offered to establish complex centres embracing all disciplines (respectively known as WDC-A and WDC-B). In most disciplines there was a third or even a fourth centre, known as WDC-C1 if in Western Europe and WDC-C2 if in Asia or Australia (the European centres being known simply as WDC-C if there is no corresponding WDC-C2). Multiple centres were deemed advisable to guard against catastrophic loss and for the convenience of the senders and users of the observational data.

The IGY programmes were limited and the WDC system did not cover all fields of interest. It did not include hydrology or geology (though some geological data were acquired by WDCs); exchange of glaciological data was limited to bibliographies of published papers; data exchange in meteorology was limited; and the monitoring of nuclear radiations became defunct (some monitoring systems were revived following the Chernobyl incident). Although the WDCs for Rockets and Satellites were established, and received information on launches and orbits of spacecraft, the satellite-based scientific data were not systematically exchanged in the IGY, partly because of experimenters’ privileges. In time, however, many of these data did reach WDCs.

ICSU has set up other organizations that deal with data. The Federation of Astronomical and Geophysical Services, formalized in 1956, covers similar areas of science to those of the WDC system, but the functions are different. FAGS centres process large amounts of data to derive indices or summaries characterizing the dynamics of the Earth system. It is not their prime responsibility to archive and distribute the raw data. There is, however, some overlap with the activities of WDCs and national centres, and some FAGS centres are also WDCs. The task of CODATA, the ICSU Committee on Data for Science and Technology formed in 1966, has centred on compiling and reviewing data on physical and chemical constants and on the properties of chemical and biological substances and materials.

3. POST-IGY DEVELOPMENTS

Because of its success, the WDC system was made permanent and used for post-IGY data. New programmes evolved, based on the IGY structure as a general framework, such as the International Quiet Sun Year of 1964-1965, the International Magnetospheric Study of 1976-1979, the Solar Maximum Year of 1979-1981 and the Middle Atmosphere Programme of 1982-1985. Most of the sponsoring national bodies agreed to continue the WDCs to serve these descendant programmes and, thanks to good planning, the data collections have remained accessible to users.

During the 1960s and 1970s several developments took place that had implications for the WDC system. The discipline of solar-terrestrial physics emerged, embracing many IGY subjects: the Sun, the solar wind and interplanetary magnetic field, cosmic rays, geomagnetism, ionosphere, aurora and airglow. Several data centres were reorganized to create new combined STP data centres, and some ground-based observational networks were replaced by satellites. Some IGY centres closed, generally because of the loss of a significant user community in the host country, and in some cases because their scientific discipline became obsolete or inactive. New centres were created, for example to serve solid earth and marine geophysics.

Some national scientific agencies developed extensive National Data Centres, few of which existed during the IGY. Many WDCs are housed within these NDCs, and the operational and financial boundaries between them may be blurred. Through their WDCs, the NDCs maintain their commitments to international exchange and service to any scientist who needs data. The NDC/WDC combination is a powerful one that maintains national systems to serve national needs, in which the WDC components can continue to ensure unrestricted availability of data.

The Intergovernmental Oceanographic Commission (IOC) was established by UNESCO to coordinate and sponsor oceanographic programmes, mainly for operational use. IOC developed an extensive data centre system and guides for the exchange of oceanographic data, now merged with the ICSU WDC system for oceanography. The World Meteorological Organization (WMO) established a meteorological data centre system, supporting the Global Atmospheric Research Programme which spanned the period 1967-1979, and the current World Climate Research Programme.

Since the IGY, the gathering and exchange of data have been transformed by immense technological advances. These advances have included the replacement of analogue with digital instruments, the networking of digital instruments to simplify the collection and exchange of their data, and automatic observatories that operate unattended, sometimes for months. Examples are provided by ionospheric, geomagnetic, seismic, and meteorological stations, and upper air soundings. Personal computers, more powerful than the supercomputers of the 1970s, are now ubiquitous, together with compact disk readers. Many WDCs are now publishing collections of digital data sets on compact disks for cheap and easy distribution. Digital communication networks have made it possible to transfer large data and program files by electronic mail.

4. THE WORLD DATA CENTRE SYSTEM

Today the WDC system is healthy and viable. The 44 centres are mostly maintaining their funding, though not without struggle. In 1988 a complete set of new centres was established in China as WDC-D. Some of the WDC-B in Russia and the WDC-C2 in Japan and India have been reorganized. In recognition of the increasing interest in environmental data in general, and the needs of the IGBP (International Geosphere-Biosphere Programme) in particular, a new WDC-C for soil science has been established in the Netherlands and three new centres in the USA: WDC-A for Trace Gases at the Carbon Dioxide Information Analysis Centre at Oak Ridge; WDC-A for Remotely Sensed Land Data at the EROS Data Centre, Sioux Falls; and WDC-A for Palaeoclimatology at the National Geophysical Data Centre in Boulder (see Table 1).

The present duties and activities of WDCs (which apply also to many national data centres) may be listed as follows:

(a) Collecting and cataloguing data and information.

(b) Compiling data sets for a wide variety of small-scale, regional and global geophysical research.

(c) Maintaining the data (whether stored on paper, film, magnetic tape or modern media) in good condition.

(d) Making data accessible through copying and distributing data (for WDC operations, at minimum costs of copying).

(e) Preserving important old data sets by converting them from tabular to digital form.

(f) Making data sets available on such media as compact disks, giving users the ability to search large data collections and recompile the data sets in their home laboratory.

(g) Compiling ‘data products’, e.g. by combining data from many sources to derive geophysical indices.

(h) Compiling numerical models to describe the time-varying and space-varying geophysical environment (e.g. the geomagnetic field and the upper atmosphere).

(i) Maintaining and updating on-line information services related to the above activities.

In pursuit of these objectives:

(i) The WDC system has initiated a ‘data rescue’ programme - finding older data sets at risk because of physical deterioration, changes in policy, or reorganization of the institutions that hold the data, and taking steps to safeguard such data.

(ii) Collaborative projects are under way between WDCs-A, -B, -C and -D to achieve another kind of ‘data rescue’, the digitization of (for example) old geomagnetic and ionospheric datasets.

(iii) In particular the WDC Panel, WDC-A and WDC-D have initiated a project to bring data from the large territory of China into the global change database.

(iv) Visitor programmes to bring scientists into close contact with the data holdings and the professional staff of the WDCs are being expanded.

(v) WDCs are working with data originators to improve data documentation to enable future use of the data.

(vi) Related data sets are being compiled into databases, often in common data structures, to facilitate multidisciplinary research and multivariate analyses in models, and published on diskettes and compact disks.

The above considerations imply questions of priority, and there are other specific issues and problems, such as:

* Most scientists are unwilling to give priority to data management, especially when data projects compete with what they perceive as “real” science. Nevertheless, investigators generally react positively to offers by WDCs to assist with documenting and archiving datasets for placing in the public domain.

* There is increasing difficulty - by no means confined to developing countries with their special problems - in maintaining data flow from the regular monitoring networks (e.g. geomagnetic, ionospheric, cosmic ray). This applies less to (e.g. meteorological) networks that serve short-term operational requirements, but difficulties may arise in acquiring and preserving their data for longer-term research.

* Biospheric and human-activity data needed for global change studies, such as mapping data and information that may be non-numerical and non-continuous (e.g. soil types, vegetation types, land-use) are hard to handle.

* Technical issues of ageing, error growth and ultimate lifetimes of new data storage media (such as compact disks) need to be assessed.

WDCs cost money. Data services are expensive, though some of them have a long-term effect of transferring large data sets from the WDCs to the users, thereby relieving the WDCs of some routine duties and enabling them to undertake further innovative developments. In general the WDCs and NDCs are funded at levels insufficient to do all the data management work needed by the community, but they have nevertheless kept up with many aspects of rapidly evolving technology.

Since many WDCs are associated with a national data centre (NDC), it is difficult to estimate how much the WDC system costs to run. Based on the experience of WDC-A, it may be estimated that WDC costs are about 5-10% of the total costs of the national centres and their data handling activities, currently of order $50M/year in the USA (compared to the order of $1M/year in the IGY) and perhaps $150M/year worldwide. This implies that the national bodies spend around $10M/year to maintain their WDCs and the related international obligations, whereas ICSU spends only $150K/year on its coordinating and promotional role. Though very rough, these numbers show that the incremental costs of an efficient WDC system are small compared with the expenditure on national data centres.
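The arithmetic behind this estimate can be checked with a short sketch. It assumes the figures are read as a WDC share of 5-10%, worldwide national data centre costs of about $150M/year, and ICSU coordination costs of about $150K/year; the constant and function names are illustrative only and do not come from the WDC system.

```python
# Rough check of the cost estimate; every figure is order-of-magnitude only.
# Assumed readings: WDC share 5-10%, worldwide national-centre costs about
# $150M/year, ICSU coordination about $150K/year.

WORLDWIDE_NDC_COST = 150e6   # dollars per year (assumed reading)
WDC_SHARE_LOW = 0.05         # lower bound of the WDC share
WDC_SHARE_HIGH = 0.10        # upper bound of the WDC share
ICSU_COORDINATION = 150e3    # dollars per year (assumed reading)


def wdc_cost_range(total_cost, low=WDC_SHARE_LOW, high=WDC_SHARE_HIGH):
    """Return the (low, high) WDC cost implied by a fractional share of total cost."""
    return total_cost * low, total_cost * high


low, high = wdc_cost_range(WORLDWIDE_NDC_COST)
midpoint = (low + high) / 2

print(f"WDC cost range: ${low / 1e6:.1f}M to ${high / 1e6:.1f}M per year")
print(f"Midpoint of range: ${midpoint / 1e6:.2f}M per year")
print(f"ICSU coordination relative to midpoint: {ICSU_COORDINATION / midpoint:.1%}")
```

Under these assumed readings the implied range brackets the figure of around $10M/year, and the ICSU contribution is a small fraction of it.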

5. CONTEMPORARY OPPORTUNITIES AND ISSUES

In the 1990s, the WDC system has to serve international scientific research programmes that aim to describe the complex, non-linear and interactive Earth system, with an ultimate goal of predicting its evolution and future state. The major ICSU programmes are the Solar-Terrestrial Energy Programme (STEP, 1990-1997), the International Geosphere-Biosphere Programme (IGBP, 1991-2000), and the International Decade for Natural Disaster Reduction (IDNDR, started in 1990). Of these, STEP is in a field long served by WDCs, but the others embrace disciplines and types of data not hitherto familiar to the WDC system and present new challenges. The overall aim requires at least the following:

First, the unrestricted exchange of environmental data which is a sine qua non of any research programme to understand the Earth and its variability. The data are needed (i) to describe the boundary conditions that define the present state of the Earth’s climate and biospheric systems, (ii) to understand the workings of myriad individual physical and biospheric processes involved in the global system, and (iii) to monitor the progressive effects of those processes.

Second, updating the historical records, beginning with the modern instrumental era which began some 200 years ago, to provide the longer-term context within which to study the present variability of the Earth system. This necessitates the removal of data artifacts caused by changes in instruments, location, local environment and analysis techniques and, where feasible, conversion of the records to digital form. Examples are the long time series of sea-surface temperatures, starting from the 1770s; historical records (e.g. crop yields, and population and tax records) from which to infer the climate record for 2000-4000 years; 4000 years of Chinese and Korean observations of sunspots; tree-ring data from many regions; and isotopic analysis of ice cores from polar regions and high glaciers.

Third, providing easy-to-use directories which tell users which data sets are available, their contents, coverage, format, and how they may be obtained. WMO’s INFOCLIMA is a good example. For global change research, new directories are being assembled by various national space and environmental agencies. Some are available online to any scientist with communication links and on computer-readable media for users without such links.
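As an illustration of what such a directory records, the following minimal sketch stores the attributes named above (contents, coverage, format, how to obtain) and searches them by keyword and year. The class, field names and sample records are hypothetical and do not describe INFOCLIMA or the holdings of any real centre.

```python
from dataclasses import dataclass


@dataclass
class DirectoryEntry:
    """One record in a hypothetical data-set directory."""
    title: str
    contents: str
    coverage_start: int   # first year covered
    coverage_end: int     # last year covered
    data_format: str
    how_to_obtain: str


def find_entries(entries, keyword, year):
    """Return entries that mention the keyword and whose coverage includes the year."""
    keyword = keyword.lower()
    return [
        e for e in entries
        if (keyword in e.title.lower() or keyword in e.contents.lower())
        and e.coverage_start <= year <= e.coverage_end
    ]


# Illustrative records only; not real holdings of any centre.
directory = [
    DirectoryEntry("Example geomagnetic hourly values", "geomagnetic field components",
                   1957, 1990, "ASCII tables", "request from holding centre"),
    DirectoryEntry("Example ionospheric soundings", "ionosphere critical frequencies",
                   1964, 1985, "compact disk", "request from holding centre"),
]

matches = find_entries(directory, "geomagnetic", 1960)
for entry in matches:
    print(entry.title, "|", entry.data_format, "|", entry.how_to_obtain)
```

The same records could equally be distributed as a flat file on computer-readable media for users without communication links; the search logic does not depend on how the entries are delivered.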

We turn to the question of the availability of data. In its Agenda 21, the 1992 United Nations Conference on Environment and Development held in Rio de Janeiro called strongly for international collaboration in data exchange. The USA tabled its official policy of unrestricted access to environmental data at minimum cost to users, which is consistent with the principles of the ICSU World Data Centre system and serves as a model for other nations to follow. Some agencies, however, espouse the idea that data have monetary value, which hinders the exchange of geophysical and environmental data and obstructs research. This ‘data market’ trend is dangerous, and may cut off data from those who try to sell their own data (in any case, the sale of data seldom covers more than a small fraction of the real cost of acquisition). Natural systems are transnational; no nation can hope to understand its own environmental situation in anything less than a continental or global context.

A major challenge for the WDC system is to define the WDCs’ role in handling the huge data streams of the major new projects, the Global Ocean Observing System and the Global Climate Observing System. The data and derived products from GOOS and GCOS are designed for operational use - weather, ocean state, etc. - but they have enormous potential for long-term research.

A word of caution is necessary. Years may pass before modern facilities are available to scientific communities everywhere, and each new development may bring a danger of enhancing the divisions between the scientific “haves” and “have-nots”. The present-day WDC system sees it as an important part of its task to promote access to modern information technology for scientists in developing countries.

6. CONCLUSION

In addressing these new questions, which are also opportunities, we need the guidance which such wise colleagues as Granville Beynon gave during IGY and the subsequent explosion of geophysical research and data. An illuminating story that Granville was fond of telling comes to mind here:

Two Welshmen were playing cards. Taffy dealt a hand, and smiled satisfactorily to himself. However, when his partner Dai played his first card, Taffy stood up, scowled, threw his cards down on the table, and roared:

‘There is nothing I think worse than to play cards with a cheat! You are not playing the cards I dealt you!’

The very wise colleagues who planned and designed the IGY gave scientists interested in data management very good cards indeed, and they were played well to build a data centre system which has grown and thrived through major changes in international research programmes. Data management is now recognized in its own right as an important branch of science and technology. Times are changing fast, and we must find the right new cards to deal to our colleagues who have to fight for the resources to provide the best data services for our future global research programmes.

Table 1 - WORLD DATA CENTRES 1993

Centre   Discipline                         Location         Country
WDC-A    Atmospheric Trace Gases            Oak Ridge, TN    USA
WDC-A    Glaciology                         Boulder, CO      USA
WDC-A    Marine Geology & Geophysics        Boulder, CO      USA
WDC-A    Meteorology                        Asheville, NC    USA
WDC-A    Oceanography                       Washington, DC   USA
WDC-A    Palaeoclimatology                  Boulder, CO      USA
WDC-A    Remotely Sensed Land Data          Sioux Falls, SD  USA
WDC-A    Rockets & Satellites               Greenbelt, MD    USA
WDC-A    Rotation of the Earth              Washington, DC   USA
WDC-A    Seismology                         Denver, CO       USA
WDC-A    Solar-Terrestrial Physics          Boulder, CO      USA
WDC-A    Solid Earth Geophysics             Boulder, CO      USA
WDC-B1   Meteorology                        Obninsk          Russia
WDC-B1   Oceanography                       Obninsk          Russia
WDC-B2   Solar-Terrestrial Physics          Moscow           Russia
WDC-B2   Solid Earth Geophysics             Moscow           Russia
WDC-B    Marine Geology & Geophysics        Gelendzhik       Russia
WDC-C    Earth Tides                        Brussels         Belgium
WDC-C1   Geomagnetism                       Copenhagen       Denmark
WDC-C1   Geomagnetism                       Edinburgh        United Kingdom
WDC-C    Glaciology                         Cambridge        United Kingdom
WDC-C    Recent Crustal Motions             Prague           Czech Republic
WDC-C    Soil Geography & Classification    Wageningen       Netherlands
WDC-C    Solar Activity                     Meudon           France
WDC-C1   Solar-Terrestrial Physics          Chilton          United Kingdom
WDC-C    Sunspot Index                      Brussels         Belgium
WDC-C2   Airglow                            Tokyo            Japan
WDC-C2   Aurora                             Tokyo            Japan
WDC-C2   Cosmic Rays                        Toyokawa         Japan
WDC-C2   Geomagnetism                       Bombay           India
WDC-C2   Geomagnetism                       Kyoto            Japan
WDC-C2   Ionosphere                         Tokyo            Japan
WDC-C2   Nuclear Radiation                  Tokyo            Japan
WDC-C2   Solar Radio Emissions              Nobeyama         Japan
WDC-C2   Solar-Terrestrial Activity         Sagamihara       Japan
WDC-D    Astronomy (Solar)                  Beijing          China
WDC-D    Geology                            Beijing          China
WDC-D    Geophysics                         Beijing          China
WDC-D    Glaciology & Geocryology           Lanzhou          China
WDC-D    Meteorology                        Beijing          China
WDC-D    Oceanography                       Tianjin          China
WDC-D    Renewable Resources & Environment  Beijing          China
WDC-D    Seismology                         Beijing          China
WDC-D    Space Sciences                     Beijing          China