DSD-INT 2015 - Datacubes: understanding big EO data better - Peter Baumann
TRANSCRIPT
![Page 1: DSD-INT 2015 - Datacubes understanding big eo data better - Peter baumann](https://reader031.vdocuments.site/reader031/viewer/2022020410/58e6618e1a28ab8d758b533d/html5/thumbnails/1.jpg)
Datacubes :: Delft SW Days :: ©2015 rasdaman
Delft Software Days, Deltares, Delft, 2015-oct-26
Peter Baumann
Jacobs University | rasdaman GmbH
Datacubes:
Exploiting Big Earth Data Better
[co-funded by EU through EarthServer 1/2, PublicaMundi]
[gamingfeeds.com]
![Page 2: DSD-INT 2015 - Datacubes understanding big eo data better - Peter baumann](https://reader031.vdocuments.site/reader031/viewer/2022020410/58e6618e1a28ab8d758b533d/html5/thumbnails/2.jpg)
![Page 3: DSD-INT 2015 - Datacubes understanding big eo data better - Peter baumann](https://reader031.vdocuments.site/reader031/viewer/2022020410/58e6618e1a28ab8d758b533d/html5/thumbnails/3.jpg)
Datacube Research @ Jacobs U
Large-Scale Scientific Information Systems research group
• Flexible, scalable n-D array services
• www.jacobs-university.de/lsis
Main results:
• pioneer Array DBMS, rasdaman
• standardization:
• OGC Big Geo Data (also ISO, INSPIRE, W3C)
• ISO „Science SQL“
Hiring PhD students, PostDocs
![Page 4: DSD-INT 2015 - Datacubes understanding big eo data better - Peter baumann](https://reader031.vdocuments.site/reader031/viewer/2022020410/58e6618e1a28ab8d758b533d/html5/thumbnails/4.jpg)
![Page 5: DSD-INT 2015 - Datacubes understanding big eo data better - Peter baumann](https://reader031.vdocuments.site/reader031/viewer/2022020410/58e6618e1a28ab8d758b533d/html5/thumbnails/5.jpg)
Data Homogenization: Making Life Simpler
sensor feeds
coverage server
sensor, image [timeseries], simulation, statistics data
![Page 6: DSD-INT 2015 - Datacubes understanding big eo data better - Peter baumann](https://reader031.vdocuments.site/reader031/viewer/2022020410/58e6618e1a28ab8d758b533d/html5/thumbnails/6.jpg)
OGC Web Coverage Service (WCS)
OGC/ISO Coverages: regular & irregular grids, point clouds, meshes
- OGC Coverage Implementation Schema
Large, growing implementation basis: rasdaman, GDAL, QGIS, OpenLayers, OPeNDAP, MapServer, GeoServer, NASA WorldWind, EOxServer; Pyxis, ERDAS, ArcGIS, ...
WCS Core: access to spatio-temporal coverages & subsets
- subset = trim | slice
WCS Extensions: optional functionality facets
- Scaling, CRS transformation, …, Analytics (WCPS)
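The difference between the two subsetting operations can be illustrated in memory. The following is a minimal sketch, not WCS itself: it assumes a tiny made-up datacube stored as nested lists `cube[t][y][x]`, and the helper names `trim` and `slice_t` are hypothetical. A trim cuts axes to an interval and keeps the number of dimensions; a slice fixes one axis at a single position and so reduces the dimensionality by one.

```python
# Hypothetical 3-D datacube indexed as cube[t][y][x]; the cell values are made up.
cube = [[[t * 100 + y * 10 + x for x in range(4)] for y in range(3)] for t in range(2)]


def trim(cube, t_range, y_range, x_range):
    """Trim: cut every axis to an inclusive [lo, hi] interval; the result is still 3-D."""
    (t0, t1), (y0, y1), (x0, x1) = t_range, y_range, x_range
    return [
        [row[x0:x1 + 1] for row in plane[y0:y1 + 1]]
        for plane in cube[t0:t1 + 1]
    ]


def slice_t(cube, t):
    """Slice: fix the t axis at one position; the t axis disappears from the result."""
    return cube[t]


trimmed = trim(cube, (0, 1), (1, 2), (2, 3))  # 3-D result of extent 2 x 2 x 2
sliced = slice_t(cube, 1)                     # 2-D result of extent 3 x 4
```

A real WCS request combines both: for example, a trim in x/y together with a slice in t yields a single 2-D image cut out of the cube.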
![Page 7: DSD-INT 2015 - Datacubes understanding big eo data better - Peter baumann](https://reader031.vdocuments.site/reader031/viewer/2022020410/58e6618e1a28ab8d758b533d/html5/thumbnails/7.jpg)
Array Analytics
Array Analytics := efficient analysis on multi-dimensional arrays of a size several orders of magnitude above the evaluation engine's main memory
Essential data property: n-dimensional Euclidean neighborhood
- Secondary: #dimensions, density, ...
Operations: Linear Algebra [M. Stonebraker],
statistics, image/signal processing
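Processing arrays far larger than main memory usually means working on one partition at a time and combining partial results. The sketch below shows that idea for a mean; it is an illustration only, and the generator `tiles` (which stands in for reading partitions from disk) and the function `streaming_mean` are hypothetical names, not part of any product.

```python
def tiles(n_tiles, tile_len):
    """Yield one tile (a flat list of cell values) at a time; stands in for disk reads."""
    for i in range(n_tiles):
        start = i * tile_len
        yield [float(v) for v in range(start, start + tile_len)]


def streaming_mean(tile_iter):
    """Combine per-tile partial sums so that only one tile is held in memory at a time."""
    total, count = 0.0, 0
    for tile in tile_iter:
        total += sum(tile)
        count += len(tile)
    return total / count if count else float("nan")


mean = streaming_mean(tiles(n_tiles=4, tile_len=5))
```

Aggregates such as sum, count, min and max decompose this way; operations that depend on neighbouring cells also need the cells along tile borders.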
![Page 8: DSD-INT 2015 - Datacubes understanding big eo data better - Peter baumann](https://reader031.vdocuments.site/reader031/viewer/2022020410/58e6618e1a28ab8d758b533d/html5/thumbnails/8.jpg)
Datacube Access: A Simple Example
![Page 9: DSD-INT 2015 - Datacubes understanding big eo data better - Peter baumann](https://reader031.vdocuments.site/reader031/viewer/2022020410/58e6618e1a28ab8d758b533d/html5/thumbnails/9.jpg)
A Brief History of Array DBMSs
first appearance in literature (not first implementation)
![Page 10: DSD-INT 2015 - Datacubes understanding big eo data better - Peter baumann](https://reader031.vdocuments.site/reader031/viewer/2022020410/58e6618e1a28ab8d758b533d/html5/thumbnails/10.jpg)
Agile Array Analytics: rasdaman
„raster data manager“: SQL + n-D arrays
Scalable parallel “tile streaming” architecture
Blueprint for ISO Array SQL standard
[rasdaman visitors]
![Page 11: DSD-INT 2015 - Datacubes understanding big eo data better - Peter baumann](https://reader031.vdocuments.site/reader031/viewer/2022020410/58e6618e1a28ab8d758b533d/html5/thumbnails/11.jpg)
Array SQL
select id, encode( (scene.band1-scene.band2)/(scene.band1+scene.band2), 'image/tiff' )
from LandsatScenes
where acquired between '1990-06-01' and '1990-06-30' and
avg( (scene.band3-scene.band4)/(scene.band3+scene.band4) ) > 0
create table LandsatScenes(
id: integer not null, acquired: date,
scene: row( band1: integer, ..., band7: integer ) mdarray [ 0:4999,0:4999] )
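To make the array part of the query concrete, here is a small Python sketch of the same logic on made-up data: a cell-wise normalized difference of two bands, and an average over a second normalized difference used as a filter. The scene values, the dictionary layout and the helper names are assumptions for illustration; the date predicate is left out, and returning 0.0 where the band sum is zero is a choice made here, not something the query defines.

```python
from statistics import mean


def normalized_difference(a, b):
    """Cell-wise (a - b) / (a + b) over two equally sized 2-D bands.

    Cells where a + b == 0 are set to 0.0 (an assumption of this sketch).
    """
    return [
        [(x - y) / (x + y) if (x + y) != 0 else 0.0 for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


def array_avg(arr):
    """Average over all cells of a 2-D array, like avg() in the query."""
    return mean(v for row in arr for v in row)


# Hypothetical 2 x 2 scenes keyed by id; real scenes would be 5000 x 5000.
scenes = {
    1: {"band1": [[3, 1], [2, 2]], "band2": [[1, 1], [2, 6]],
        "band3": [[4, 4], [4, 4]], "band4": [[2, 2], [2, 2]]},
    2: {"band1": [[1, 1], [1, 1]], "band2": [[1, 1], [1, 1]],
        "band3": [[1, 1], [1, 1]], "band4": [[3, 3], [3, 3]]},
}

# "where avg(...) > 0" selects scenes; "select ..." computes the result array.
selected = {
    sid: normalized_difference(s["band1"], s["band2"])
    for sid, s in scenes.items()
    if array_avg(normalized_difference(s["band3"], s["band4"])) > 0
}
```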
![Page 12: DSD-INT 2015 - Datacubes understanding big eo data better - Peter baumann](https://reader031.vdocuments.site/reader031/viewer/2022020410/58e6618e1a28ab8d758b533d/html5/thumbnails/12.jpg)
Direct Data Visualization
select
encode(
struct {
red: (char) s.b7[x0:x1,x0:x1],
green: (char) s.b5[x0:x1,x0:x1],
blue: (char) s.b0[x0:x1,x0:x1],
alpha: (char) scale( d, 20 )
},
"image/png"
)
from SatImage as s, DEM as d
[JacobsU, Fraunhofer; data courtesy BGS, ESA]
![Page 13: DSD-INT 2015 - Datacubes understanding big eo data better - Peter baumann](https://reader031.vdocuments.site/reader031/viewer/2022020410/58e6618e1a28ab8d758b533d/html5/thumbnails/13.jpg)
Partitioned Array Storage
Goal: faster loading by adapting storage units to access patterns
Approach: partition n-D array into n-D partitions („tiles“)
Tiling classification based on degree of alignment [ICDE 1999]
chunking [Sarawagi, Stonebraker, DeWitt, ... ]
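The simplest case, regular aligned tiling, can be sketched as follows. This is a minimal illustration under assumptions, not rasdaman's storage manager: domains are lists of inclusive `(lo, hi)` bounds per axis, and `regular_tiles` and `intersects` are hypothetical helpers. It shows why partitioning speeds up access: a query only has to load the tiles its box touches.

```python
from itertools import product


def regular_tiles(domain, tile_shape):
    """Split an n-D domain [(lo, hi), ...] (inclusive bounds) into aligned tiles."""
    per_axis = []
    for (lo, hi), step in zip(domain, tile_shape):
        per_axis.append([(s, min(s + step - 1, hi)) for s in range(lo, hi + 1, step)])
    return list(product(*per_axis))


def intersects(tile, query):
    """True if the tile and the query box overlap on every axis."""
    return all(
        t_lo <= q_hi and q_lo <= t_hi
        for (t_lo, t_hi), (q_lo, q_hi) in zip(tile, query)
    )


domain = [(0, 4999), (0, 4999)]              # a 5000 x 5000 array
tiles = regular_tiles(domain, (1000, 1000))  # 25 tiles of 1000 x 1000 cells
query = [(900, 1100), (0, 10)]               # hypothetical access region
hit = [t for t in tiles if intersects(t, query)]
```

Here only 2 of the 25 tiles have to be read to answer the query.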
![Page 14: DSD-INT 2015 - Datacubes understanding big eo data better - Peter baumann](https://reader031.vdocuments.site/reader031/viewer/2022020410/58e6618e1a28ab8d758b533d/html5/thumbnails/14.jpg)
Tiling: Tuning Data for Applications
tiling strategies as service tuning [Furtado]:
- regular, directional, area of interest
rasdaman storage layout language
insert into MyCollection
values ...
tiling area of interest [0:20,0:40], [45:80,80:85]
tile size 1000000
index d_index storage array compression zlib
„chunks“ [Sarawagi, DeWitt, ...]
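One simple way to picture area-of-interest tiling is to cut every axis at the boundaries of the declared areas, so that each area becomes a tile of its own and can be fetched in one read. The sketch below does only that. It is a simplification and not rasdaman's actual algorithm: it ignores the tile size limit from the statement, the 100 x 100 domain is an assumption, and `aoi_tiles` is a hypothetical helper.

```python
from itertools import product


def aoi_tiles(domain, areas):
    """Cut each axis at every area-of-interest boundary and combine the intervals.

    domain: [(lo, hi), ...] with inclusive bounds, one pair per axis.
    areas:  list of boxes in the same format.
    """
    per_axis = []
    for axis, (lo, hi) in enumerate(domain):
        cuts = {lo, hi + 1}
        for area in areas:
            a_lo, a_hi = area[axis]
            cuts.update((a_lo, a_hi + 1))
        edges = sorted(c for c in cuts if lo <= c <= hi + 1)
        per_axis.append([(a, b - 1) for a, b in zip(edges, edges[1:])])
    return list(product(*per_axis))


domain = [(0, 99), (0, 99)]                          # assumed array extent
areas = [[(0, 20), (0, 40)], [(45, 80), (80, 85)]]   # the two areas from the statement
tiles = aoi_tiles(domain, areas)
```

With these inputs the domain is split into 16 irregular tiles, two of which coincide exactly with the areas of interest.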
![Page 15: DSD-INT 2015 - Datacubes understanding big eo data better - Peter baumann](https://reader031.vdocuments.site/reader031/viewer/2022020410/58e6618e1a28ab8d758b533d/html5/thumbnails/15.jpg)
Why Irregular Tiling?
e-Science often uses irregular partitioning
[OpenStreetMap]
[Centrella et al: scidacreviews.org]
![Page 16: DSD-INT 2015 - Datacubes understanding big eo data better - Peter baumann](https://reader031.vdocuments.site/reader031/viewer/2022020410/58e6618e1a28ab8d758b533d/html5/thumbnails/16.jpg)
Parallel / Distributed Query Processing
1 query, 1,000+ cloud nodes
[SIGMOD DANAC 2014]
Dataset B
Dataset A
Dataset D
Dataset C
select
max((A.nir - A.red) / (A.nir + A.red))
- max((B.nir - B.red) / (B.nir + B.red))
- max((C.nir - C.red) / (C.nir + C.red))
- max((D.nir - D.red) / (D.nir + D.red))
from A, B, C, D
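The query above splits naturally: each `max(...)` term touches only one dataset, so it can be evaluated where that dataset lives, and only four scalars have to travel to be combined. The sketch below mimics this with a thread pool standing in for cloud nodes. It is an illustration under assumptions, not rasdaman's query distribution; the data values and the name `max_ndvi` are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def max_ndvi(dataset):
    """Work one node can do locally: max of (nir - red) / (nir + red) over all cells."""
    return max(
        (n - r) / (n + r)
        for n, r in zip(dataset["nir"], dataset["red"])
        if n + r != 0
    )


# Hypothetical flat datasets; each one would reside on a different node.
datasets = {
    "A": {"nir": [6.0, 3.0], "red": [2.0, 1.0]},
    "B": {"nir": [5.0, 1.0], "red": [3.0, 1.0]},
    "C": {"nir": [9.0, 2.0], "red": [7.0, 2.0]},
    "D": {"nir": [1.0, 1.0], "red": [1.0, 3.0]},
}

# Evaluate the four subexpressions concurrently; only scalars come back.
with ThreadPoolExecutor(max_workers=4) as pool:
    partial = dict(zip(datasets, pool.map(max_ndvi, datasets.values())))

# Final combination step, matching the subtraction chain in the query.
result = partial["A"] - partial["B"] - partial["C"] - partial["D"]
```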
![Page 17: DSD-INT 2015 - Datacubes understanding big eo data better - Peter baumann](https://reader031.vdocuments.site/reader031/viewer/2022020410/58e6618e1a28ab8d758b533d/html5/thumbnails/17.jpg)
Secured Archive Integration
First-ever direct, ad-hoc mix from protected NASA & ESA services
in OGC WCS/WCPS Web client (EarthServer + CobWeb)
![Page 18: DSD-INT 2015 - Datacubes understanding big eo data better - Peter baumann](https://reader031.vdocuments.site/reader031/viewer/2022020410/58e6618e1a28ab8d758b533d/html5/thumbnails/18.jpg)
Scalable Geo Service Architecture
Web clients (m2m, browser)
OGC WMS, WCS, WCPS, WPS
distributed query processing
No single point of failure
external files
Internet
rasserver
database
File system
rasdaman geo services
alternative storage
![Page 19: DSD-INT 2015 - Datacubes understanding big eo data better - Peter baumann](https://reader031.vdocuments.site/reader031/viewer/2022020410/58e6618e1a28ab8d758b533d/html5/thumbnails/19.jpg)
Inset: Hadoop not the Answer to All
no built-in knowledge about structured data types
- “Since it was not originally designed to leverage the structure
[…] its performance […] is therefore suboptimal” [Daniel Abadi]
• M. Stonebraker (XLDB 2012): „will hit a scalability wall“
![Page 20: DSD-INT 2015 - Datacubes understanding big eo data better - Peter baumann](https://reader031.vdocuments.site/reader031/viewer/2022020410/58e6618e1a28ab8d758b533d/html5/thumbnails/20.jpg)
EarthServer: Datacubes At Your Fingertips
INSPIRE WCS :: ©2015 P. Baumann www.earthserver.eu
![Page 21: DSD-INT 2015 - Datacubes understanding big eo data better - Peter baumann](https://reader031.vdocuments.site/reader031/viewer/2022020410/58e6618e1a28ab8d758b533d/html5/thumbnails/21.jpg)
EarthServer: Datacubes At Your Fingertips
Agile Analytics on x/y/t + x/y/z/t Earth & Planetary datacubes
- EU rasdaman + NASA WorldWind
- Rigorously standards-based: OGC WMS + WCS + WCPS
- 100s of TB sites now, next: 1+ Petabyte per cube
Intercontinental initiative, 3+3 years:
EU + US + AUS
Phase 1 reviewers:
"proven evidence" that rasdaman will "significantly transform [how to] access and use data" … and "with no doubt has been shaping the Big Earth Data landscape" …
www.earthserver.eu
![Page 22: DSD-INT 2015 - Datacubes understanding big eo data better - Peter baumann](https://reader031.vdocuments.site/reader031/viewer/2022020410/58e6618e1a28ab8d758b533d/html5/thumbnails/22.jpg)
EarthServer Phase 1 & 2 Partners
www.earthserver.eu
![Page 23: DSD-INT 2015 - Datacubes understanding big eo data better - Peter baumann](https://reader031.vdocuments.site/reader031/viewer/2022020410/58e6618e1a28ab8d758b533d/html5/thumbnails/23.jpg)
Conclusion
From data stewardship to service stewardship
Open-standard Coverage Cubes interoperability
- consensus across OGC, ISO, INSPIRE
EarthServer: agile analytics federation
- rasdaman Array DBMS
„A cube tells more than a million images“
[rasdaman/WW screenshots]