Data Standards Workflow
[Diagram: Extract → Transform → Load → Provide]
Extract (Raw data): store raw data in subversion to keep track of history
Transform (Scripts): script to convert raw data into netcdf; add meta information
Load (Database): stored files (netcdf) accessible through the web
Provide: charts & maps, tools and websites
Components: OpenEarthRawData, OpenEarth, OPeNDAP, OpenEarthTools
Transform
• Add metadata
• Store in netcdf
• Save script in subversion

Add metadata
• Use the INSPIRE metadata form to store information about the dataset.
• http://www.inspire-geoportal.eu/inspireEditor.htm
• Click launch editor
Transform – add metadata
In the editor, fill in the tabs:
• Validation: turn validation on.
• File identification: location in subversion (micore).
• Quality: history of your data.
• Constraints: please fill in limitations of use.
Store in course/Pcnumber/inspire_description.xml

Save metadata file
1. Save metadata file (local)
2. Add to subversion (local)
3. Commit => metadata into subversion (remote)
Store in netcdf
• What’s netcdf?
• Write a script to transform data into netcdf
• Using CF convention

What is netcdf
• Data format defined by Unidata
• Data store used for coverage data and multidimensional data
• CF Metadata convention
Transform – store in netcdf - netcdf
What is netcdf
[Figure: array with axes X, Y, Z and T]
• An array-based data structure for storing multidimensional data
• N-dimensional coordinate systems
  • X coordinate (e.g. longitude)
  • Y coordinate (e.g. latitude)
  • Z coordinate (e.g. altitude)
  • Time dimension
  • … other dimensions
• Variables: support for multiple variables
  • Temperature, humidity, pressure, salinity, etc.
• Geometry: implicit or explicit
  • Regular grid (implicit)
  • Irregular grid
  • Points
Storing Multidimensional Data
X Y Z Q
1 1 1 0.5
1 1 2 0.3
1 2 1 0.6
1 2 2 0.1
2 1 1 0.4
2 1 2 0.2
2 2 1 0.9
2 2 2 0.3
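The same data as an array with coordinate axes:
X = 1 2
Y = 1 2
Z = 1 2
Q at Z = 1 (rows Y = 1, 2; columns X = 1, 2):
0.5 0.4
0.6 0.9
Q at Z = 2 (rows Y = 1, 2; columns X = 1, 2):
0.3 0.2
0.1 0.3
Table: 32 numbers. Array: 14 numbers.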
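The count on this slide can be checked with a small Matlab sketch. It rebuilds the array form from the table form; the variable names (T, Q, nTable, nArray) are made up for this illustration, and it relies on the coordinates in the example happening to be valid 1-based indices.

```matlab
% Table form: one row per point, columns X, Y, Z, Q
T = [1 1 1 0.5
     1 1 2 0.3
     1 2 1 0.6
     1 2 2 0.1
     2 1 1 0.4
     2 1 2 0.2
     2 2 1 0.9
     2 2 2 0.3];

% Array form: three short coordinate axes plus one value per grid cell
x = unique(T(:, 1));
y = unique(T(:, 2));
z = unique(T(:, 3));
Q = NaN(numel(x), numel(y), numel(z));

% In this example the coordinates can be used directly as indices
idx = sub2ind(size(Q), T(:, 1), T(:, 2), T(:, 3));
Q(idx) = T(:, 4);

% Count the stored numbers in both forms
nTable = numel(T);
nArray = numel(x) + numel(y) + numel(z) + numel(Q);
```

The saving grows with the size of the grid, because the coordinates are stored once per axis instead of once per point.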
Data Model
Data model for netcdf and others.
Also usable for HDF, OPeNDAP, GRIB, etc. See the Java library for details.
ArcGIS
ArcGIS also reads and writes netcdf files.
Transform – store in netcdf – netcdf - applications
Your favorite text editor
XML representation of a netcdf file
Other Tools
NCO
# diff
ncdiff -v time file1.nc file2.nc
# compression & packing
ncpdq -4 -L 9 in.nc out.nc # Deflated packing (~80% lossy compression)
# selecting variables by regex
ncks -v '^Q..' in.nc # Q01--Q99, QAA--QZZ, etc.
IDV
Very useful
Web hyperslabs, cool!
Not so stable.
Transform – store in netcdf - script
Write script
• Read raw data
  • Read header line
  • Read data
  • Read all data
  • Create function to read all data
  • Use function in Matlab
• Raw data into empty netcdf file
  • Create empty netcdf file
  • Add dimensions and variables
  • Store variables
• Read values
Reading raw data into memory
• Use one of the following Matlab functions to read the file data into an array
  • fscanf
Example: Transect.txt file
1999 58
-135 3531 -130 3541 -125 3631 -120 4171 -115 6221
-110 8231 -105 9841 -100 10971 -95 12171 -90 12951
…
200 -2415 210 -2995 220 -3595 99999999999 99999999999
2000 58
-135 3531 -130 3541 -125 3631 -120 4171 -115 6221
-110 8231 -105 9841 -100 10971 -95 12171 -90 12951
Header line: year, number of points
Points: X Z X Z …. 9999999
Location: OpenEarthRawData\course\example\raw
Read header line
>> fid = fopen('..\raw\transect.txt')
fid =
    15
>> header = fscanf(fid, '%d', 2)
header =
        2000
          58
>> year = header(1)
year =
        2000
>> npoint = header(2)
npoint =
    58
Read data
Script:
    % read header
    header = fscanf(fid, '%d', 2);
    year = header(1);
    % store year in time
    time(i) = year;
    npoint = header(2);
    % read data
    data = fscanf(fid, '%d', npoint*2);   % step 1
    data = reshape(data, [2, npoint]);    % step 2
    % use column vectors
    data = data';                         % step 3
Step 1:
>> % read data
data = fscanf(fid, '%d', npoint*2)
data =
        -150
        3741
        -140
        3581
        -135
Step 2:
>> data = reshape(data, [2, npoint])
data =
  Columns 1 through 7
        -150        -140        -135        -130
        3741        3581        3531        3541
Step 3:
>> % use column vectors
data = data'
data =
        -150        3741
        -140        3581
        -135        3531
Read all data
    % preallocate all data
    % (time, coastward)
    transectseries = NaN(3, 58);
    coastward_distance = NaN(58, 1);
    time = NaN(3, 1);
    % open file and get file id
    fid = fopen('..\raw\transect.txt');
    i = 1;
    while (~feof(fid))
        % read header
        header = fscanf(fid, '%d', 2);
        if numel(header) < 2
            break   % nothing but trailing whitespace left
        end
        year = header(1);
        % store year in time
        time(i) = year;
        npoint = header(2);
        % read data
        data = fscanf(fid, '%d', npoint*2);
        data = reshape(data, [2, npoint]);
        % use column vectors
        data = data';
        % store data in transect series
        transectseries(i,:) = data(:,2);
        coastward_distance(:) = data(:,1);
        % skip the rest of the line (the 99999999999 padding)
        fgetl(fid);
        i = i + 1;
    end
    % close the file again
    fclose(fid);
Create a function
    function transect = readtransect(filename)
    % preallocate all data
    % (time, coastward)
    transectseries = NaN(3, 58);
    coastward_distance = NaN(58, 1);
    time = NaN(3, 1);
    % open file and get file id
    fid = fopen(filename);
    i = 1;
    while (~feof(fid))
        % read header
        header = fscanf(fid, '%d', 2);
        if numel(header) < 2
            break   % nothing but trailing whitespace left
        end
        year = header(1);
        % store year in time
        time(i) = year;
        npoint = header(2);
        % read data
        data = fscanf(fid, '%d', npoint*2);
        data = reshape(data, [2, npoint]);
        % use column vectors
        data = data';
        % store data in transect series
        transectseries(i,:) = data(:,2);
        coastward_distance(:) = data(:,1);
        % skip the rest of the line (the 99999999999 padding)
        fgetl(fid);
        i = i + 1;
    end
    % close the file again
    fclose(fid);
    transect = struct('series', transectseries, ...
        'distance', coastward_distance, 'time', time);
    end
Use the new function
>> data = readtransect('..\raw\transect.txt')
data =
      series: [3x58 double]
    distance: [58x1 double]
        time: [3x1 double]
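To try the reading pattern without the course data, the sketch below first writes a tiny file in the same layout (header with year and number of points, X Z pairs, then 99999999999 padding) and reads it back. The file name, the sample values and the growing arrays are assumptions made for this illustration; readtransect above preallocates fixed sizes instead.

```matlab
% Write a small sample file in the temp directory (hypothetical name and values)
filename = fullfile(tempdir, 'transect_sample.txt');
fid = fopen(filename, 'w');
fprintf(fid, '1999 3\n');
fprintf(fid, '-10 3531 0 1201 10 -2415 99999999999 99999999999\n');
fprintf(fid, '2000 3\n');
fprintf(fid, '-10 3581 0 1251 10 -2395 99999999999 99999999999\n');
fclose(fid);

% Read it back: header, data, then skip the rest of the line
fid = fopen(filename);
time = [];
series = [];
while true
    header = fscanf(fid, '%d', 2);
    if numel(header) < 2
        break                               % no further record
    end
    npoint = header(2);
    data = fscanf(fid, '%d', npoint*2);
    data = reshape(data, [2, npoint])';     % columns: X, Z
    time(end+1, 1) = header(1);             % grow by one row per record
    series(end+1, :) = data(:, 2)';
    distance = data(:, 1);
    fgetl(fid);                             % skip the padding
end
fclose(fid);
```

Growing the arrays avoids hard-coding 3 records of 58 points, at the cost of reallocation on every record, which is fine for small files.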
Loading data into netcdf
• What does a netcdf file look like
• Required meta information
Netcdf file
transect.nc
netcdf transect {
dimensions:
    coastward = 58 ;
    time = 3 ;
variables:
    float coastward_distance(coastward) ;
        coastward_distance:unit = "metre" ;
    float year(time) ;
        year:unit = "year" ;
    float height(time, coastward) ;
        height:unit = "metre" ;
data:
 coastward_distance = -135, -130,…, 150, 160, 170, 180, 190, 200, 210, 220 ;
 year = 1999, 2000, 2001 ;
 height = 353, 354, … -142, -146, -170, -206, -232, -273, -309, -346, -375, -388, … -32, … -92, -110, -127, -143, -156, -177, -211, -259, -303, -334 ;
}
Create an empty netcdf file
>> nc_create_empty(outputfile)
>> nc_dump(outputfile)
netcdf transect.nc {
dimensions:
variables:
}
Add dimensions
>> nc_add_dimension(outputfile, 'coastward', 58)
>> nc_add_dimension(outputfile, 'time', 3)
>> nc_dump(outputfile)
netcdf transect.nc {
dimensions:
    coastward = 58 ;
    time = 3 ;
variables:
}
>> help nc_add_dimension
Add variables
    coastwardVariable = struct(...
        'Name', 'coastward_distance', ...
        'Nctype', 'float', ...
        'Dimension', {{'coastward'}}, ...
        'Attribute', struct('Name', 'unit', 'Value', 'metre') ...
        );
    nc_addvar(outputfile, coastwardVariable);
    timeVariable = struct(...
        'Name', 'year', ...
        'Nctype', 'float', ...
        'Dimension', {{'time'}}, ...
        'Attribute', struct('Name', 'unit', 'Value', 'year') ...
        );
    nc_addvar(outputfile, timeVariable);
    heightVariable = struct(...
        'Name', 'height', ...
        'Nctype', 'float', ...
        'Dimension', {{'time', 'coastward'}}, ...
        'Attribute', struct('Name', 'unit', 'Value', 'metre') ...
        );
    nc_addvar(outputfile, heightVariable);
    nc_dump(outputfile)
>> help nc_addvar
Result
netcdf transect.nc {
dimensions:
    coastward = 58 ;
    time = 3 ;
variables:
    float coastward_distance(coastward), shape = [58]
        coastward_distance:unit = "metre"
    float year(time), shape = [3]
        year:unit = "year"
    float height(time,coastward), shape = [3 58]
        height:unit = "metre"
}
Store variables
nc_varput(outputfile, 'height', data.series)
nc_varput(outputfile, 'year', data.time)
nc_varput(outputfile, 'coastward_distance', data.distance)
help nc_varput
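If the nc_* toolbox functions used on these slides are not on the path, roughly the same steps can be sketched with Matlab's built-in netcdf functions (nccreate, ncwriteatt, ncwrite, ncread). This is a swapped-in alternative, not the method of the course; the file name and the sample values are made up, and the variables use the default double type rather than float.

```matlab
% Hypothetical output file; delete an earlier copy so nccreate starts clean
outputfile = fullfile(tempdir, 'transect_sketch.nc');
if exist(outputfile, 'file')
    delete(outputfile);
end

% Made-up sample data standing in for the result of readtransect
distance = [-10; 0; 10];
year = [1999; 2000];
% heights, sized (time, coastward)
series = [353 120 -241
          358 125 -239];

% Create the variables; nccreate also creates the named dimensions
nccreate(outputfile, 'coastward_distance', 'Dimensions', {'coastward', 3});
nccreate(outputfile, 'year', 'Dimensions', {'time', 2});
% Dimensions are given in Matlab order, so the array is sized (coastward, time)
nccreate(outputfile, 'height', 'Dimensions', {'coastward', 3, 'time', 2});

% Attributes, using the attribute name from the slides
ncwriteatt(outputfile, 'coastward_distance', 'unit', 'metre');
ncwriteatt(outputfile, 'year', 'unit', 'year');
ncwriteatt(outputfile, 'height', 'unit', 'metre');

% Store the variables
ncwrite(outputfile, 'coastward_distance', distance);
ncwrite(outputfile, 'year', year);
ncwrite(outputfile, 'height', series');

% Read the values back
height = ncread(outputfile, 'height');
```

Note the transpose when writing height: the built-in functions list dimensions in Matlab order, so a dump of the file should still show height(time, coastward).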
Result: Netcdf file
transect.nc
netcdf transect {
dimensions:
    coastward = 58 ;
    time = 3 ;
variables:
    float coastward_distance(coastward) ;
        coastward_distance:unit = "metre" ;
    float year(time) ;
        year:unit = "year" ;
    float height(time, coastward) ;
        height:unit = "metre" ;
data:
 coastward_distance = -135, -130,…, 150, 160, 170, 180, 190, 200, 210, 220 ;
 year = 1999, 2000, 2001 ;
 height = 353, 354, … -142, -146, -170, -206, -232, -273, -309, -346, -375, -388, … -32, … -92, -110, -127, -143, -156, -177, -211, -259, -303, -334 ;
}
Read values
surface(nc_varget(outputfile, 'height')')
[Figure: surface plot of height; axes show time index (1 to 3), coastward index (0 to 60) and height (-5000 to 15000)]
Transform – store in netcdf - convention
CF convention
Standard used by USGS, NOAA, Arcgis, GDAL
Climate and Forecast (CF) Convention
http://www.unidata.ucar.edu/software/netcdf/docs/conventions.html
Initially developed for
  • Climate and forecast data
  • Atmosphere, surface and ocean model-generated data
• Also used for observational datasets
• CF is the most widely used convention for geospatial netCDF data.
Improve output
• Store extra attributes
  • Title
  • Author
  • Standard_name
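A minimal sketch of storing these extra attributes, using Matlab's built-in ncwriteatt rather than the nc_* functions from the earlier slides. The file name, title and author values are placeholders, and 'altitude' is only an assumed choice of CF standard name for the height variable. Note that CF spells the attribute `units`, where the earlier slides use `unit`.

```matlab
% Hypothetical file with one variable to attach attributes to
outputfile = fullfile(tempdir, 'transect_attributes_sketch.nc');
if exist(outputfile, 'file')
    delete(outputfile);
end
nccreate(outputfile, 'height', 'Dimensions', {'coastward', 3, 'time', 2});

% Global attributes: '/' addresses the file itself (placeholder values)
ncwriteatt(outputfile, '/', 'title', 'Example transect heights');
ncwriteatt(outputfile, '/', 'author', 'Your name');

% Variable attributes: standard_name is an assumed choice for this variable
ncwriteatt(outputfile, 'height', 'standard_name', 'altitude');
ncwriteatt(outputfile, 'height', 'units', 'm');

% Read two attributes back to check
title_back = ncreadatt(outputfile, '/', 'title');
name_back = ncreadatt(outputfile, 'height', 'standard_name');
```

Pick the standard_name from the CF standard name table that matches what was actually measured.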
Transform – save script
Save script
1. Save script (local, using matlab https://repos.deltares.nl/repos/OpenEarthRawData/course/PCnr/scipts/)
2. Add to subversion (local)
3. Commit => script into subversion (remote)