elag workshop sessie 3 v4
![Page 1: Elag workshop sessie 3 v4](https://reader033.vdocuments.site/reader033/viewer/2022052505/55639533d8b42acc128b554e/html5/thumbnails/1.jpg)
3TU.Datacentrum
Workshop session 3
“Darelux”
![Page 2: Elag workshop sessie 3 v4](https://reader033.vdocuments.site/reader033/viewer/2022052505/55639533d8b42acc128b554e/html5/thumbnails/2.jpg)
• Data collection:
  – Hydrology measurements
  – Several institutes
  – Different sensors at multiple locations
  – Long-term value
• Programme started 2003
Data Archiving River Environment LUXembourg
Once upon a time …
Maisbich: 1.2 km²
Ardennes Massif (slate)
Huewelerbach catchment, basin area approx. 2.7 km², mainly sandstone.
![Page 3: Elag workshop sessie 3 v4](https://reader033.vdocuments.site/reader033/viewer/2022052505/55639533d8b42acc128b554e/html5/thumbnails/3.jpg)
Centre Recherche Gabriel Lippmann
April 12, 2023
[Diagram labels: River, Interception, Floor, Evaporation, Precipitation, Shallow ground, Deep ground]
Measurements
• Universiteit Utrecht: soil moisture
• Gabriel Lippmann, TUD: piezometers
• TUD: runoff over the road
• Universiteit Luxemburg, Gabriel Lippmann, TUD: measuring weir
• Universiteit Luxemburg, Gabriel Lippmann, TUD: tracers
![Page 4: Elag workshop sessie 3 v4](https://reader033.vdocuments.site/reader033/viewer/2022052505/55639533d8b42acc128b554e/html5/thumbnails/4.jpg)
• 3 Soil moisture probes
• 16 temperature sensors + 1 fibre optic
• 5 V-notch discharge meters
• Meteo station, pluviometer, interception measurement device, groundwater level piezometer, …
• Data per month
• ASCII files from sensors
Sensors and data
![Page 5: Elag workshop sessie 3 v4](https://reader033.vdocuments.site/reader033/viewer/2022052505/55639533d8b42acc128b554e/html5/thumbnails/5.jpg)
![Page 6: Elag workshop sessie 3 v4](https://reader033.vdocuments.site/reader033/viewer/2022052505/55639533d8b42acc128b554e/html5/thumbnails/6.jpg)
‘Evolution’ in 3TU.DC
5 steps
Each step: considerations & solutions…
![Page 7: Elag workshop sessie 3 v4](https://reader033.vdocuments.site/reader033/viewer/2022052505/55639533d8b42acc128b554e/html5/thumbnails/7.jpg)
• Very high importance placed on long-term preservation
• Limited bandwidth for downloads
• No known standards from community
The beginning … (1a/5)
![Page 8: Elag workshop sessie 3 v4](https://reader033.vdocuments.site/reader033/viewer/2022052505/55639533d8b42acc128b554e/html5/thumbnails/8.jpg)
• All information (data AND metadata) in single containers.
  – Relations were considered too risky
  – ‘Homebrew’ XML with very specific tags
• Container size limited to 2 MB
  – Measurements stored per month per sensor * location.
XML containers … (1b/5)
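A minimal sketch of what such a self-contained container could look like, assuming a made-up schema: the slides do not show the original ‘homebrew’ tags, so every tag name below (`container`, `metadata`, `measurement`, …) is hypothetical, as is reading the 2 MB limit as 2 × 1024 × 1024 bytes.

```python
import xml.etree.ElementTree as ET

# The 2 MB container limit from the slide (binary megabytes is an assumption).
MAX_CONTAINER_BYTES = 2 * 1024 * 1024


def build_container(sensor, location, month, readings):
    """Build one XML container holding metadata AND data for one month.

    All tag names are hypothetical; the original schema is not shown.
    """
    root = ET.Element("container")

    # Metadata is repeated inside every container (no external relations).
    meta = ET.SubElement(root, "metadata")
    ET.SubElement(meta, "sensor").text = sensor
    ET.SubElement(meta, "location").text = location
    ET.SubElement(meta, "month").text = month

    # The measurements themselves, one element per reading.
    data = ET.SubElement(root, "data")
    for timestamp, value in readings:
        measurement = ET.SubElement(data, "measurement", time=timestamp)
        measurement.text = repr(value)

    payload = ET.tostring(root, encoding="utf-8")
    if len(payload) > MAX_CONTAINER_BYTES:
        raise ValueError("container exceeds the 2 MB limit; split the period")
    return payload


container = build_container(
    "soil-moisture-1",
    "Huewelerbach",
    "2010-05",
    [("2010-05-01T00:00:00", 0.31), ("2010-05-01T01:00:00", 0.30)],
)
```

Because the sensor and location description travels with every monthly file, each container can be understood on its own, at the cost of repeating the same metadata twelve times a year per sensor.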
![Page 9: Elag workshop sessie 3 v4](https://reader033.vdocuments.site/reader033/viewer/2022052505/55639533d8b42acc128b554e/html5/thumbnails/9.jpg)
Questions?
Differences:
![Page 10: Elag workshop sessie 3 v4](https://reader033.vdocuments.site/reader033/viewer/2022052505/55639533d8b42acc128b554e/html5/thumbnails/10.jpg)
• Need for simpler metadata
• No repetition of metadata (sensor and location for every month)
• Less afraid of long term risks with (archive internal) relations
Another step … (2a/5)
![Page 11: Elag workshop sessie 3 v4](https://reader033.vdocuments.site/reader033/viewer/2022052505/55639533d8b42acc128b554e/html5/thumbnails/11.jpg)
• New data model and ‘cleaned’ XML
  – Datasets, Measuring instruments, Locations, Time
Linked XML … (2b/5)
[Diagram: linked data model]
Node types: Collection (C1), Instrument (A1), Dataset (D1, D2), Period (day, month, year), Place (point, area), DATA (data file)
Relations: measuredBy, isMemberOfCollection, temporal, locatedAt, isPartOf, calculatedFrom, datafile
Properties: title, creator, created, mimeType, longitude, latitude
Example values: title “windmeter” (anemometer), title “Dak van EWI” (roof of EWI), title “Delft”, longitude 4.3742, latitude 51.9973, mimeType application/x-netcdf, created 2011-01-01T00:00:00
Information resource / Non-information resource (NIR)
3 URIs:
– the NIR (#)
– HTML representation
– RDF (ORE)
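The linked model can be illustrated with plain subject-predicate-object triples. This is only a sketch: the relation names are taken from the diagram, but the node identifiers and the exact way the nodes are wired together are assumptions made for illustration, since the extracted slide text does not preserve the arrows.

```python
from collections import defaultdict

# (subject, predicate, object) triples. Predicate names come from the slide;
# node identifiers and which nodes each predicate connects are assumptions.
TRIPLES = [
    ("D1", "isMemberOfCollection", "C1"),
    ("D1", "measuredBy", "A1"),
    ("D1", "temporal", "period-day"),
    ("D2", "calculatedFrom", "D1"),
    ("A1", "locatedAt", "place-point"),
    ("place-point", "isPartOf", "place-area"),
    ("period-day", "isPartOf", "period-month"),
    ("period-month", "isPartOf", "period-year"),
]


def index_by_subject(triples):
    """Group (predicate, object) pairs under their subject."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return index


def ancestors(node, index, predicate="isPartOf"):
    """Follow one predicate upward, e.g. day -> month -> year."""
    chain = []
    current = node
    while True:
        parents = [o for p, o in index.get(current, []) if p == predicate]
        if not parents:
            return chain
        current = parents[0]
        chain.append(current)


index = index_by_subject(TRIPLES)
period_chain = ancestors("period-day", index)
```

With relations like these, the instrument and location are described once and referenced by every dataset, instead of being copied into each monthly container.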
![Page 12: Elag workshop sessie 3 v4](https://reader033.vdocuments.site/reader033/viewer/2022052505/55639533d8b42acc128b554e/html5/thumbnails/12.jpg)
Questions?
Differences:
![Page 13: Elag workshop sessie 3 v4](https://reader033.vdocuments.site/reader033/viewer/2022052505/55639533d8b42acc128b554e/html5/thumbnails/13.jpg)
• Need for more generalisation (find suitable standard for numerical data)
• Binary formats considered too risky for long term preservation
Halfway there … (3a/5)
![Page 14: Elag workshop sessie 3 v4](https://reader033.vdocuments.site/reader033/viewer/2022052505/55639533d8b42acc128b554e/html5/thumbnails/14.jpg)
• NcML (XML representation of NetCDF)
NcML … (3b/5)
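To give an impression of NcML, the sketch below writes a small one-dimensional series as an NcML document using only the Python standard library. The element and attribute names (`netcdf`, `dimension`, `variable`, `attribute`, `values`) and the namespace follow the NcML 2.2 schema; the variable name, units and values are invented for illustration and do not come from the Darelux data.

```python
import xml.etree.ElementTree as ET

NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"


def to_ncml(var_name, units, values):
    """Serialise a 1-D float series along a 'time' dimension as NcML text."""
    # Emit the NcML namespace as the default namespace (no prefix).
    ET.register_namespace("", NCML_NS)

    def tag(name):
        return f"{{{NCML_NS}}}{name}"

    root = ET.Element(tag("netcdf"))
    ET.SubElement(root, tag("dimension"), name="time", length=str(len(values)))
    var = ET.SubElement(
        root, tag("variable"), name=var_name, shape="time", type="float"
    )
    ET.SubElement(var, tag("attribute"), name="units", value=units)
    # Values are stored as whitespace-separated text, not as binary.
    ET.SubElement(var, tag("values")).text = " ".join(str(v) for v in values)
    return ET.tostring(root, encoding="unicode")


# Illustrative values only.
ncml_text = to_ncml("water_level", "m", [0.42, 0.4, 0.39])
```

The data stays human-readable text, which addresses the preservation concern about binary formats, but every number is stored as characters, so files are larger and need parsing at ingest and dissemination.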
![Page 15: Elag workshop sessie 3 v4](https://reader033.vdocuments.site/reader033/viewer/2022052505/55639533d8b42acc128b554e/html5/thumbnails/15.jpg)
Questions?
Differences:
![Page 16: Elag workshop sessie 3 v4](https://reader033.vdocuments.site/reader033/viewer/2022052505/55639533d8b42acc128b554e/html5/thumbnails/16.jpg)
• Reduce processing (at ingest and dissemination)
• Increase usability
Almost there … (4a/5)
![Page 17: Elag workshop sessie 3 v4](https://reader033.vdocuments.site/reader033/viewer/2022052505/55639533d8b42acc128b554e/html5/thumbnails/17.jpg)
• NetCDF (binary) format
  – Directly usable with common tools (Matlab, Python/Java/C, …) after download.
Binary data streams … (4b/5)
Is NetCDF a Good Archive Format?
NetCDF classic or 64-bit offset formats can be used as a general-purpose archive format for storing arrays.
Compression of data is possible with netCDF (e.g., using arrays of eight-bit or 16-bit integers to encode low-resolution floating-point numbers instead of arrays of 32-bit numbers), or the resulting data file may be compressed before storage (but must be uncompressed before it is read). Hence, using these netCDF formats may require more space than special-purpose archive formats that exploit knowledge of particular characteristics of specific datasets.
With netCDF-4/HDF5 format, the zlib library can provide compression on a per-variable basis. That is, some variables may be compressed, others not. In this case the compression and decompression of data happen transparently to the user, and the data may be stored, read, and written compressed.
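The integer encoding mentioned in the quote can be sketched in a few lines. The parameter names `scale_factor` and `add_offset` mirror the attribute names conventionally used for packed netCDF variables; the temperature values and the chosen 0.01 resolution are assumptions for illustration, and this sketch uses plain Python arrays rather than a netCDF library.

```python
from array import array


def pack(values, scale_factor, add_offset):
    """Encode floats as 16-bit signed integers.

    packed = round((value - add_offset) / scale_factor)
    """
    packed = array("h")  # 'h' = signed short, 2 bytes per item
    for x in values:
        n = round((x - add_offset) / scale_factor)
        if not -32768 <= n <= 32767:
            raise OverflowError(f"{x} does not fit in 16 bits with this scale/offset")
        packed.append(n)
    return packed


def unpack(packed, scale_factor, add_offset):
    """Decode: value = packed * scale_factor + add_offset."""
    return [n * scale_factor + add_offset for n in packed]


# Illustrative temperatures with a resolution of 0.01 degrees.
temps = [12.34, 12.41, 11.98, 13.02]
packed = pack(temps, scale_factor=0.01, add_offset=0.0)
restored = unpack(packed, scale_factor=0.01, add_offset=0.0)
```

Each value takes 2 bytes instead of the 4 bytes of a 32-bit float, and the round trip is accurate to within half the scale factor, which is acceptable only when the sensor resolution is no finer than that.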
![Page 18: Elag workshop sessie 3 v4](https://reader033.vdocuments.site/reader033/viewer/2022052505/55639533d8b42acc128b554e/html5/thumbnails/18.jpg)
Questions?
Differences:
![Page 19: Elag workshop sessie 3 v4](https://reader033.vdocuments.site/reader033/viewer/2022052505/55639533d8b42acc128b554e/html5/thumbnails/19.jpg)
• Wish for ‘Flexible’ granularity
• Further increase usability (lower threshold after steeper learning curve)
• Reduce storage requirements
The ‘last’ step … (5a/5)
![Page 20: Elag workshop sessie 3 v4](https://reader033.vdocuments.site/reader033/viewer/2022052505/55639533d8b42acc128b554e/html5/thumbnails/20.jpg)
• OPeNDAP
  – Combine sets server side (1 year of data = 1 download instead of 12)
  – URL queries from common tools
  – Inspect metadata for each variable or dimension
  – ‘Cut, slice & sample’ in datasets server side
• Only binary formats stored
  – Approx. 75% reduction in size (compared to only XML format)
Interaction with NetCDF … (5b/5)
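A ‘cut, slice & sample’ request is expressed as a constraint on the dataset URL. The sketch below builds such a URL using the DAP2 hyperslab form `variable[start:stride:stop]`; the server name, dataset path and variable name are hypothetical and only serve as an example.

```python
def opendap_url(dataset_url, variable, start, stop, stride=1, response="ascii"):
    """Build a DAP2 request for variable[start:stride:stop].

    Indices are zero-based and 'stop' is inclusive. The 'response' suffix
    selects the representation, e.g. 'ascii', 'dods', 'dds' or 'das'.
    """
    if start < 0 or stop < start or stride < 1:
        raise ValueError("need 0 <= start <= stop and stride >= 1")
    return f"{dataset_url}.{response}?{variable}[{start}:{stride}:{stop}]"


# Hypothetical server, dataset path and variable name, for illustration only.
BASE = "http://opendap.example.org/darelux/huewelerbach_2010.nc"

# Every 24th value of the first 744 hourly readings (one per day of a month).
url = opendap_url(BASE, "water_level", start=0, stop=743, stride=24)
```

The subsetting happens on the server, so only the requested values cross the network, which matters given the limited download bandwidth noted at the start. Some HTTP clients require the square brackets to be percent-encoded.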
![Page 21: Elag workshop sessie 3 v4](https://reader033.vdocuments.site/reader033/viewer/2022052505/55639533d8b42acc128b554e/html5/thumbnails/21.jpg)
![Page 22: Elag workshop sessie 3 v4](https://reader033.vdocuments.site/reader033/viewer/2022052505/55639533d8b42acc128b554e/html5/thumbnails/22.jpg)
Questions?
Differences: