Open-source Scientific Computing and Data Analytics using HDF

Aashish Chaudhary, [email protected], Technical Leader, with Patrick O’Leary, Dr. Rama Nemani (NASA), Chris Harris, Chris Kotfila, Doruk Aztek, Andrew Michaelis (NASA). Open-source Scientific Computing and Data Analytics using HDF. July 24th 2017, ESIP Summer

Upload: the-hdf-eos-tools-and-information-center

Post on 22-Jan-2018


TRANSCRIPT

Page 1: Open-source Scientific Computing and Data Analytics using HDF

Aashish Chaudhary

[email protected]

Technical Leader

with Patrick O’Leary, Dr. Rama Nemani (NASA), Chris Harris, Chris Kotfila, Doruk Aztek, Andrew Michaelis (NASA)

Open-source Scientific Computing and Data Analytics using HDF

July 24th 2017

ESIP Summer

Page 2: Open-source Scientific Computing and Data Analytics using HDF

What We Do at Kitware?

Open Source and Open Data are strongly encouraged and practiced at Kitware

Page 3: Open-source Scientific Computing and Data Analytics using HDF

It started with VTK

Page 4: Open-source Scientific Computing and Data Analytics using HDF

Parallel Processing and Rendering - ParaView

Page 5: Open-source Scientific Computing and Data Analytics using HDF

Computer Vision

Function (DARPA)

Images, Video, Point Clouds

Recognition by Function

Content-based Retrieval

Event & Activity Recognition

Anomaly Detection

3D Extraction and Compression

Detection & Tracking

Page 6: Open-source Scientific Computing and Data Analytics using HDF

Medical Computing

Quantitative imaging

Electronic health records

Vascular analysis

Surgical guidance and simulation

Digital pathology

Orthopedic analysis

Longitudinal and population shape analysis

Interactive medical applications and visualizations

Page 7: Open-source Scientific Computing and Data Analytics using HDF

Community Adaptation

Page 8: Open-source Scientific Computing and Data Analytics using HDF

HDF at Kitware

High Performance Computing

Extensible Data Model and Format (XDMF)

- Developed to exchange scientific data between HPC codes and tools

- Heavy data is stored using HDF5

Climate Community

Network Common Data Form (NetCDF)

- Most projects use NetCDF4

Medical Community

Vision Community

Leading-edge algorithms for registering and segmenting multidimensional data
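NetCDF4 files are stored as HDF5 containers, so a tool that takes data from both the climate and HPC communities often has to tell the on-disk formats apart before choosing a reader. The following is a minimal sketch and not part of the original slides. `sniff_format` is a hypothetical helper name, and the check relies only on the published file signatures: the 8-byte HDF5 signature, which may sit at offset 0, 512, 1024, and so on, and the `CDF` magic bytes of classic NetCDF.

```python
import io

# 8-byte signature that starts every HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def sniff_format(stream):
    """Return a coarse format label for a seekable binary stream.

    Hypothetical helper for illustration: it only inspects magic bytes
    and does not validate the rest of the file.
    """
    stream.seek(0)
    head = stream.read(4)
    # Classic NetCDF: b"CDF" followed by a version byte (1, 2 or 5).
    if head[:3] == b"CDF" and head[3:4] in (b"\x01", b"\x02", b"\x05"):
        return "netcdf-classic"

    # HDF5 (and therefore NetCDF4): the signature may follow a user
    # block, so probe offsets 0, 512, 1024, 2048, ... up to file size.
    stream.seek(0, io.SEEK_END)
    size = stream.tell()
    offset = 0
    while offset + len(HDF5_SIGNATURE) <= size:
        stream.seek(offset)
        if stream.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
            return "hdf5"
        offset = 512 if offset == 0 else offset * 2
    return "unknown"


# In-memory stand-ins for real files, used only to exercise the helper.
plain_hdf5 = io.BytesIO(HDF5_SIGNATURE + b"\x00" * 64)
with_user_block = io.BytesIO(b"\x00" * 512 + HDF5_SIGNATURE + b"\x00" * 64)
classic_netcdf = io.BytesIO(b"CDF\x01" + b"\x00" * 64)
```

A label of `hdf5` here covers both plain HDF5 and NetCDF4 files; telling those two apart requires opening the file with an HDF5 or NetCDF library.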

Page 9: Open-source Scientific Computing and Data Analytics using HDF

ACME

The Accelerated Climate Modeling for Energy (ACME) project is sponsored by the Earth System Modeling (ESM) program (Biological and Environmental Research) with eight national laboratories and six partner institutions to develop and apply the most complete, leading-edge climate and Earth system models to challenging and demanding climate-change research imperatives.

Most commonly used data format - NetCDF4

Data streaming using OpenDAP

Python Interface for most of the tools
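Streaming over OPeNDAP usually means asking the server for a subset of one variable rather than downloading the whole NetCDF4 file. The sketch below is illustrative only: it builds a DAP2-style request URL with a hyperslab constraint of the form `[start:stride:stop]` per dimension, where `stop` is inclusive. The server address, file name, and variable name `tas` are assumptions made for the example, and `dap2_subset_url` is a hypothetical helper name.

```python
from urllib.parse import quote


def dap2_subset_url(base_url, variable, slices, response="ascii"):
    """Build a DAP2 URL that requests a hyperslab of one variable.

    slices is a list of (start, stride, stop) tuples, one per dimension,
    with an inclusive stop index as in DAP2 constraint expressions.
    """
    parts = []
    for start, stride, stop in slices:
        if start < 0 or stride < 1 or stop < start:
            raise ValueError(f"invalid hyperslab: {(start, stride, stop)}")
        parts.append(f"[{start}:{stride}:{stop}]")
    constraint = variable + "".join(parts)
    # Keep the bracket and colon characters readable in the query string.
    return f"{base_url}.{response}?{quote(constraint, safe='[]:,')}"


# Hypothetical dataset URL and variable name, for illustration only.
url = dap2_subset_url(
    "http://example.org/opendap/tas.nc",
    "tas",
    [(0, 1, 9), (10, 1, 20)],
)
print(url)
```

Only the first ten steps of the first dimension and indices 10 to 20 of the second are requested, which is what keeps the transfer small when the full array is large.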

Page 10: Open-source Scientific Computing and Data Analytics using HDF

OpenNEX

NEX is a platform for scientific collaboration, knowledge sharing and research for the Earth science community

Global Daily Downscaled Projections (NEX-GDDP, NetCDF4)

MODIS-Land and Atmosphere (HDF)

Page 11: Open-source Scientific Computing and Data Analytics using HDF

Web Visualization

Data processing

Gaia

Page 12: Open-source Scientific Computing and Data Analytics using HDF

Web Visualization

Data processing

Pure JS?

Page 13: Open-source Scientific Computing and Data Analytics using HDF

HDF5 File Organization
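An HDF5 file is organized as a hierarchy: groups contain other groups and datasets, both can carry attributes, and every object is addressed by a slash-separated path. The following toy model is a sketch of that organization in plain Python, not the HDF5 library or its API. The class names and the example paths and attributes are all assumptions chosen for illustration.

```python
class Node:
    """Anything in the hierarchy; carries key/value attributes."""

    def __init__(self, **attrs):
        self.attrs = dict(attrs)


class Dataset(Node):
    """A typed, shaped array (only the metadata is modeled here)."""

    def __init__(self, shape, dtype, **attrs):
        super().__init__(**attrs)
        self.shape = shape
        self.dtype = dtype


class Group(Node):
    """A container of named child groups and datasets."""

    def __init__(self, **attrs):
        super().__init__(**attrs)
        self.children = {}

    def add(self, name, node):
        self.children[name] = node
        return node

    def get(self, path):
        """Resolve a slash-separated path relative to this group."""
        node = self
        for part in path.strip("/").split("/"):
            if part:
                node = node.children[part]
        return node

    def walk(self, prefix=""):
        """Yield (path, node) for every descendant, depth first."""
        for name, child in sorted(self.children.items()):
            path = f"{prefix}/{name}"
            yield path, child
            if isinstance(child, Group):
                yield from child.walk(path)


# Hypothetical layout for a small climate file.
root = Group(title="toy file")
model = root.add("model", Group())
model.add("temperature", Dataset((365, 180, 360), "float32", units="K"))
root.add("time", Dataset((365,), "float64"))

paths = [path for path, _ in root.walk()]
```

Here `paths` lists every object in the toy file, and `root.get("/model/temperature")` resolves a dataset the same way a path resolves an object inside a real HDF5 file.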

Page 14: Open-source Scientific Computing and Data Analytics using HDF

Preprocessing Simulation Postprocessing

Page 15: Open-source Scientific Computing and Data Analytics using HDF
Page 16: Open-source Scientific Computing and Data Analytics using HDF

Possible Improvements

Streaming and Big Data analytics

- Any useful ingestion of HDF data into a cluster requires an ETL pipeline

- For some tools, computation cannot move close to the data; streaming support is necessary in such cases

- Optimal read/write on cloud storage

Web-Support

- More tools and projects are moving to support web-enabled data analysis and visualization

- Pure JS implementation if possible
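One reason read performance on cloud storage matters is that every read of an object store becomes an HTTP request, so many small chunk reads are far more expensive than on a local disk. The sketch below illustrates one common mitigation, under stated assumptions: given the byte extents of the chunks a reader needs, it merges neighbouring extents and formats standard HTTP `Range` header values. The function names, the `max_gap` parameter, and the example extents are hypothetical, and nothing here is taken from an HDF library.

```python
def coalesce_ranges(extents, max_gap=0):
    """Merge (offset, length) extents that touch or lie within max_gap bytes.

    Returns a sorted list of (start, end) pairs with an exclusive end.
    """
    merged = []
    for offset, length in sorted(extents):
        if length <= 0:
            continue  # nothing to read for empty extents
        end = offset + length
        if merged and offset - merged[-1][1] <= max_gap:
            # Close enough to the previous range: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([offset, end])
    return [(start, end) for start, end in merged]


def range_header(start, end):
    """Format an HTTP Range header value; HTTP byte ranges are inclusive."""
    return f"bytes={start}-{end - 1}"


# Hypothetical chunk extents: two adjacent chunks and one far away.
chunks = [(0, 100), (100, 50), (4096, 10)]
requests = [range_header(s, e) for s, e in coalesce_ranges(chunks)]
```

With these extents the three chunk reads collapse into two requests. Raising `max_gap` trades a few wasted bytes for fewer round trips.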

Page 17: Open-source Scientific Computing and Data Analytics using HDF

Summary

● HDF is a widely used data format for scientific computing, climate/geospatial visualization, and other domains at Kitware

● Recently we have started using HDF for information visualization

● We are looking forward to HDF usage in cloud and web environments

● Kitware is always looking for strong open-source collaborations and is committed to pushing open-source scientific computing to the next level

Page 18: Open-source Scientific Computing and Data Analytics using HDF

Information

Aashish Chaudhary: [email protected]

LinkedIn: www.linkedin.com/in/aachaudhary

Kitware: http://www.kitware.com

NASA-NEX: https://nex.nasa.gov/nex

Kitware-AIST: https://github.com/OpenGeoscience/nex

HPC Cloud: http://www.kitware.com/publications/item/view/1784

HPCCloud GitHub: https://github.com/Kitware/HPCCloud