
Parallel Geospatial Data Management for Multi-Scale Environmental Data Analysis on GPUs

Visiting Faculty: Jianting Zhang, The City College of New York (CCNY), jzhang@cs.ccny.cuny.edu
ORNL Host: Dali Wang, Climate Change Science Institute (CCSI), wangd@ornl.gov

Overview

As the spatial and temporal resolutions of Earth observatory data and Earth system simulation outputs get higher, in-situ and/or post-processing of such large amounts of geospatial data increasingly becomes a bottleneck in scientific inquiries into Earth systems and their human impacts. Existing geospatial techniques based on outdated computing models (e.g., serial algorithms and disk-resident systems), as implemented in many commercial and open source packages, are incapable of processing large-scale geospatial data at the desired level of performance. Partially supported by DOE's Visiting Faculty Program (VFP), we are investigating a set of parallel data structures and algorithms capable of utilizing the massively data parallel computing power available on commodity Graphics Processing Units (GPUs). We have designed and implemented a popular geospatial technique called Zonal Statistics, i.e., deriving histograms of points or raster cells in polygonal zones. Results show that our GPU-based parallel Zonal Statistics technique, applied to 3,000+ US counties over 20+ billion NASA SRTM 30-meter resolution Digital Elevation Model (DEM) raster cells, achieves impressive end-to-end runtimes: 101 seconds and 46 seconds on a low-end workstation equipped with an Nvidia GTX Titan GPU using cold and hot caches, respectively; 60-70 seconds using a single OLCF Titan computing node; and 10-15 seconds using 8 nodes.

[Overview figure: high-resolution satellite imagery, in-situ observation sensor data, and global/regional climate model outputs feed data assimilation and Zonal Statistics over ecological, environmental and administrative zones (ROIs); temporal trends are derived on a high-end computing facility, with work mapped to GPU thread blocks.]

Background & Motivation

Zonal Statistics on NASA Shuttle Radar Topography Mission (SRTM) Data


SQL: SELECT COUNT(*) FROM T_O, T_Z WHERE ST_WITHIN(T_O.the_geom, T_Z.the_geom) GROUP BY T_Z.z_id;

Point-in-Polygon Test

For each county, derive its histogram of elevations from raster cells that are in the polygon.
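The point-in-polygon test referred to above is typically the classic ray-crossing (even-odd rule) test. Below is a minimal illustrative sketch in CUDA C++; the function name pointInPolygon and the flat vertex arrays are assumptions, not the poster's actual code.

#include <cstdio>

// Ray-crossing (even-odd rule) point-in-polygon test.
// vx, vy: polygon vertex coordinates; n: number of vertices.
// Returns true if point (px, py) lies inside the polygon.
__host__ __device__
bool pointInPolygon(const float* vx, const float* vy, int n, float px, float py)
{
    bool inside = false;
    for (int i = 0, j = n - 1; i < n; j = i++) {
        // Toggle whenever the horizontal ray from (px, py) crosses edge (j, i).
        if (((vy[i] > py) != (vy[j] > py)) &&
            (px < (vx[j] - vx[i]) * (py - vy[i]) / (vy[j] - vy[i]) + vx[i]))
            inside = !inside;
    }
    return inside;
}

int main()
{
    // Toy example: unit square.
    float vx[] = {0.f, 1.f, 1.f, 0.f};
    float vy[] = {0.f, 0.f, 1.f, 1.f};
    printf("inside: %d, outside: %d\n",
           (int)pointInPolygon(vx, vy, 4, 0.5f, 0.5f),
           (int)pointInPolygon(vx, vy, 4, 1.5f, 0.5f));
    return 0;
}

Each edge check costs roughly 20 floating-point operations, which is the per-vertex constant used in the brute-force cost estimate below.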

• SRTM: 20×10^9 (billion) raster cells (~40 GB raw, ~15 GB compressed TIFF)
• Zones: 3,141 counties, 87,097 vertices

Brute-force point-in-polygon testing
• RT = (# of points) × (# of vertices) × (# of ops per point-in-polygon edge check) / (ops per second) = 20×10^9 × 87,097 × 20 / (10×10^9) ≈ 3.5×10^6 seconds ≈ 40 days
• Using all of Titan's 18,688 nodes: ~200 seconds
• FLOPS utilization is typically low: it can be <1% for data-intensive applications (typically <10% in HPC)

Hybrid Spatial Databases + HPC Approach
Observation: only points/cells that are close enough to polygons need to be tested.
Question: how do we pair neighboring points/cells with polygons?

Minimum Bounding Boxes (MBRs)

Step 1: divide a raster tile into blocks and generate per-block histograms.
Step 2: derive polygon MBRs and pair MBRs with blocks through a box-in-polygon test (inside/intersect).
Step 3: aggregate per-block histograms into per-polygon histograms for blocks that lie within polygons.
Step 4: for each intersecting polygon/block pair, perform a point(cell)-in-polygon test for all raster cells in the block and update the respective polygon histogram.

(A) Intersect (B) Inside (C) Outside
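As a rough host-side sketch of how Steps 2-4 fit together, assuming Step 2's inside/intersect/outside classification is already available; all names (BlockClass, dispatchBlocks) are illustrative, and the dense polygon×block loop is only for brevity, since the real pipeline only visits MBR-paired candidates.

#include <cstddef>
#include <utility>
#include <vector>

enum class BlockClass { Outside, Inside, Intersect };   // outcome of Step 2

// Fold per-block histograms (Step 1) into per-polygon histograms.
// "Inside" blocks are aggregated wholesale (Step 3); "Intersect" blocks are
// queued for exact per-cell point-in-polygon tests on the GPU (Step 4).
void dispatchBlocks(const std::vector<std::vector<int>>& blockHist,   // [block][bin]
                    const std::vector<std::vector<BlockClass>>& cls,  // [polygon][block]
                    std::vector<std::vector<int>>& polyHist,          // [polygon][bin]
                    std::vector<std::pair<int,int>>& intersectPairs)  // (polygon, block)
{
    for (std::size_t p = 0; p < cls.size(); ++p) {
        for (std::size_t b = 0; b < cls[p].size(); ++b) {
            if (cls[p][b] == BlockClass::Inside) {
                for (std::size_t bin = 0; bin < blockHist[b].size(); ++bin)
                    polyHist[p][bin] += blockHist[b][bin];        // Step 3
            } else if (cls[p][b] == BlockClass::Intersect) {
                intersectPairs.emplace_back((int)p, (int)b);      // defer to Step 4
            }
            // Outside blocks contribute nothing.
        }
    }
}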

GPU Implementations

Identifying parallelisms and mapping to hardware:
(1) Deriving per-block histograms
(2) Block-in-polygon test
(3) Aggregating histograms for “within” blocks
(4) Point-in-polygon test for individual cells and histogram update

(1) Per-block histogramming: one GPU thread block per raster block; histogram bins are updated with AtomicAdd.
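A minimal sketch of such a per-block histogramming kernel follows; NUM_BINS, the short elevation type, and the bin mapping are illustrative assumptions, not the poster's code.

#define NUM_BINS 256   // illustrative; the experiments allow up to 5000 bins

// One CUDA thread block builds the histogram of one raster block.
// A shared-memory histogram absorbs the AtomicAdd traffic; the finished
// per-block histogram is then written out to global memory.
__global__ void perBlockHistogram(const short* raster,   // elevation cells, block-major
                                  int cellsPerBlock,
                                  int* blockHist)         // [numRasterBlocks][NUM_BINS]
{
    __shared__ int sh[NUM_BINS];
    for (int i = threadIdx.x; i < NUM_BINS; i += blockDim.x)
        sh[i] = 0;
    __syncthreads();

    const short* cells = raster + (long long)blockIdx.x * cellsPerBlock;
    for (int i = threadIdx.x; i < cellsPerBlock; i += blockDim.x) {
        int bin = ((int)cells[i] % NUM_BINS + NUM_BINS) % NUM_BINS;  // toy bin mapping
        atomicAdd(&sh[bin], 1);
    }
    __syncthreads();

    int* out = blockHist + blockIdx.x * NUM_BINS;
    for (int i = threadIdx.x; i < NUM_BINS; i += blockDim.x)
        out[i] = sh[i];   // this thread block owns this histogram slice exclusively
}

It would be launched with one thread block per raster block and, e.g., 256 threads per block, as in perBlockHistogram<<<numRasterBlocks, 256>>>(dRaster, 360*360, dBlockHist), matching the parameters listed later.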

(2) Block-in-polygon test for each paired MBR/block (e.g., M1C1, M1C2, M2C3):
• Point-in-polygon test for each of the cell's 4 corners
• All-pair edge intersection tests between the polygon and the cell
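The edge intersection half of this test can use the standard orientation (cross-product sign) predicate; a sketch follows, with illustrative names, and collinear/touching cases ignored for brevity.

// Sign of the cross product (b-a) x (c-a): >0 left turn, <0 right turn, 0 collinear.
__host__ __device__ inline float orient(float ax, float ay, float bx, float by,
                                        float cx, float cy)
{
    return (bx - ax) * (cy - ay) - (by - ay) * (cx - ax);
}

// True if segment (p1, p2) properly crosses segment (q1, q2):
// the endpoints of each segment lie on opposite sides of the other segment.
__host__ __device__
bool segmentsIntersect(float p1x, float p1y, float p2x, float p2y,
                       float q1x, float q1y, float q2x, float q2y)
{
    float d1 = orient(q1x, q1y, q2x, q2y, p1x, p1y);
    float d2 = orient(q1x, q1y, q2x, q2y, p2x, p2y);
    float d3 = orient(p1x, p1y, p2x, p2y, q1x, q1y);
    float d4 = orient(p1x, p1y, p2x, p2y, q2x, q2y);
    return ((d1 > 0.f) != (d2 > 0.f)) && ((d3 > 0.f) != (d4 > 0.f));
}

If no block edge crosses any polygon edge and all four corners test inside, the block is inside (case B); if any edge crosses, it straddles the boundary (case A); otherwise it is outside (case C), with the degenerate case of a polygon wholly contained in a block needing one extra containment check.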

(4) Cell-in-polygon test and histogram update (polygon vertices vs. cells):
• Perfectly coalesced memory accesses
• Utilizes GPU floating-point power
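One way these two properties can be realized is to stage the polygon's vertices in shared memory once per thread block and have each thread test one cell via consecutive (coalesced) reads of the cell arrays. The sketch below is illustrative only: the vertex budget, bin mapping, and single-polygon layout are assumptions, and the ray-crossing loop is repeated inline (it assumes nVert <= MAX_VERT) so the snippet stands alone.

#define MAX_VERT 1024   // illustrative per-polygon vertex budget
#define NUM_BINS 256    // illustrative

// All thread blocks here work on one polygon; cellX/cellY/cellVal hold the
// cells of its "intersect" raster blocks. polyHist is that polygon's histogram.
__global__ void cellInPolygonUpdate(const float* vx, const float* vy, int nVert,
                                    const float* cellX, const float* cellY,
                                    const short* cellVal, int nCells,
                                    int* polyHist)
{
    __shared__ float svx[MAX_VERT], svy[MAX_VERT];
    for (int i = threadIdx.x; i < nVert; i += blockDim.x) {  // coalesced vertex staging
        svx[i] = vx[i];
        svy[i] = vy[i];
    }
    __syncthreads();

    for (int c = blockIdx.x * blockDim.x + threadIdx.x; c < nCells;
         c += gridDim.x * blockDim.x) {                      // coalesced cell reads
        float px = cellX[c], py = cellY[c];
        bool inside = false;
        for (int i = 0, j = nVert - 1; i < nVert; j = i++)   // ray-crossing test
            if (((svy[i] > py) != (svy[j] > py)) &&
                (px < (svx[j] - svx[i]) * (py - svy[i]) / (svy[j] - svy[i]) + svx[i]))
                inside = !inside;
        if (inside) {
            int bin = ((int)cellVal[c] % NUM_BINS + NUM_BINS) % NUM_BINS;
            atomicAdd(&polyHist[bin], 1);                    // update polygon histogram
        }
    }
}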

BPQ-Tree based raster compression to save disk I/O
• Idea: chop an M-bit raster into M binary bitmaps and then build a quadtree for each bitmap (Zhang et al. 2011)

• BPQ-Tree achieves a competitive compression ratio but is much more parallelization-friendly on GPUs
• Advantage 1: compressed data is streamed from disk to GPU without requiring decompression on CPUs, reducing CPU memory footprint and data transfer time
• Advantage 2: can be used in conjunction with CPU-based compression to further improve the compression ratio
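The first stage of the BPQ-Tree idea above, chopping an M-bit raster into M binary bitmaps, can be sketched as a simple CUDA kernel; the kernel name and the 32-cells-per-word packing are assumptions, and the per-bitmap quadtree construction is not shown.

// Split a 16-bit raster into 16 binary bitmaps, one per bit plane.
// Each thread packs 32 consecutive cells of every plane into one 32-bit word.
__global__ void extractBitplanes(const unsigned short* raster, int nCells,
                                 unsigned int* bitmaps)   // [16][ceil(nCells/32)] words
{
    int wordsPerPlane = (nCells + 31) / 32;
    int word = blockIdx.x * blockDim.x + threadIdx.x;
    if (word >= wordsPerPlane) return;

    for (int plane = 0; plane < 16; ++plane) {
        unsigned int bits = 0;
        for (int k = 0; k < 32; ++k) {
            int cell = word * 32 + k;
            if (cell < nCells && ((raster[cell] >> plane) & 1u))
                bits |= (1u << k);
        }
        bitmaps[plane * wordsPerPlane + word] = bits;      // plane-th binary bitmap
    }
}

A quadtree is then built over each bitmap; large uniform quadrants (all 0s or all 1s) collapse into single nodes, which is where most of the compression comes from.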

Experiments and Evaluations

Tile    Dimension        Partition Schema
1       54000×43200      2×2
2       50400×43200      2×2
3       50400×43200      2×2
4       82800×36000      2×2
5       61200×46800      2×2
6       68400×111600     4×4
Total   20,165,760,000 cells   36 partitions

Data Format                    Volume (GB)
Original (raw)                 38
TIFF compression               15
gzip compression               8.3
BPQ-Tree compression           7.3
BPQ-Tree + gzip compression    5.5

Single Node Configuration 1: Dell T5400 WS
• Intel Xeon E5405 dual quad-core processor (2.00 GHz), 16 GB RAM, PCI-E Gen2, 3×500 GB 7200 RPM disks with 32 MB cache ($5,000)
• Nvidia Quadro 6000 GPU, 448 Fermi cores (574 MHz), 6 GB, 144 GB/s ($4,500)

Single Node Configuration 2: Do-It-Yourself Workstation
• Intel Core i5-650 dual-core, 8 GB RAM, PCI-E Gen3, 500 GB 7200 RPM disk with 32 MB cache (recycled) ($1,000)
• Nvidia GTX Titan GPU, 2688 Kepler cores (837 MHz), 6 GB, 288 GB/s ($1,000)

Parameters:
• Raster chunk size (coding): 4096×4096
• Thread block size: 256
• Maximum histogram bins: 5000
• Raster block size: 0.1×0.1 degree = 360×360 cells (resolution is 1×1 arc-second)

GPU Cluster: OLCF Titan

Results

Single Node   Cold Cache   Hot Cache
Config 1      180 s        78 s
Config 2      101 s        46 s

Runtime breakdown (seconds)                         Quadro 6000   GTX Titan   8-core CPU
(Step 0) Raster decompression                       16.2          8.30        131.9
Step 1: Per-block histogramming                     21.5          13.4        /
Step 2: Block-in-polygon test                       0.11          0.07        /
Step 3: “Within-block” histogram aggregation        0.14          0.11        /
Step 4: Cell-in-polygon test and histogram update   29.7          11.4        /
Total of major steps                                67.7          33.3        /
Wall-clock end-to-end                               85            46          /

OLCF Titan scaling
# of nodes    1      2      4      8      16
Runtime (s)   60.7   31.3   17.9   10.2   7.6

Data and Pre-processing
