exploiting weather & climate data at scale (wp4) · wp3 usability wp4 exploitability the...

40
Exploiting Weather & Climate Data at Scale (WP4) Julian Kunkel 1 Bryan N. Lawrence 2,3 Jakob Luettgau 1 Neil Massey 4 Alessandro Danca 5 Sandro Fiore 5 Huang Hu 6 1 German Climate Computing Center (DKRZ) 2 UK National Centre for Atmospheric Science 3 Department of Meteorology, University of Reading 4 STFC Rutherford Appleton Laboratory 5 CMCC Foundation 6 Seagate Technology LLC ESiWACE GA, Dec 2017

Upload: others

Post on 10-Jul-2020

12 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Exploiting Weather & Climate Data at Scale(WP4)

Julian Kunkel1 Bryan N. Lawrence2,3 Jakob Luettgau1

Neil Massey4 Alessandro Danca5 Sandro Fiore5 HuangHu6

1 German Climate Computing Center (DKRZ)2 UK National Centre for Atmospheric Science

3 Department of Meteorology, University of Reading4 STFC Rutherford Appleton Laboratory

5 CMCC Foundation6 Seagate Technology LLC

ESiWACE GA, Dec 2017

Page 2: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Outline

1 Introduction

2 Team and Tasks

3 T1: Costs

4 T2: ESDM

5 T3: SemSL

6 Dissemination

7 Next StepsDisclaimer: This material reflects only the author’s view and the EU-Commission is

not responsible for any use that may be made of the information it contains

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 2 / 23

Page 3: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Project Organisation

WP1 Governance andEngagementWP2 Global high-resolutionmodel demonstratorsWP3 Usability

WP4 Exploitability

■ The business of storing andexploiting high volume data

■ New storage layout for Earth systemdata

■ New methods of exploiting tape

WP5 Management andDisssemination

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 3 / 23

Page 4: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Community GoalsThis is what we said in 2012:

Ensemble of GlobalHigh Resolution, High Complexity

2012

Ensemble of 10 km Global Models

Ensemble of 100 km Global Models

Ensemble of 20-30 km Global Models of higher complexity

Ensemble of 10 km Regional Models, nested in:

Ensemble of 1 km Regional Models, nested in:

Ensemble of 2.5 km Regional Models, nested in:

2027

2022

2017

Impacts

Impacts

Impacts

Develop C

omm

on Fra

mework

for G

lobal M

odelscapacity

capabilit

yGlobal Model

Integrations

. . . consistent with needing an exascale machine!

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 4 / 23

Page 5: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

A modest (?) step . . .

Europe within a global model . . .

One "field-year" — 26 GB

1 field, 1 year, 6 hourly, 80 levels1 x 1440 x 80 x 148 x 192

One "field-year" — >6 TB

1 field, 1 year, 6 hourly, 180 levels1 x 1440 x 180 x 1536 x 2048

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 5 / 23

Page 6: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

. . . towards Exascale

2000 2004 2008 2012 2016 2020 2024 2028Date

100

101

102

103

104

105

106

107

Tera

byte

s Sto

red o

r Peak

Tera

flops

Availa

ble EXA

PETA

TERA

Predicted ExaFLOPS

Predicted ExaBYTES

(Data courtesy of Gary Strand, NCAR)

NCAR Storage and Compute

TB StoredPeak (Tflops)

2001 2005 2009 2013 2017 2021 2025 2029Date

100

101

102

103

104

105

106

107

108

Tera

byte

s Sto

red o

r Peak

Tera

flops

Availa

ble

EXA

PETA

TERA

Predicted ExaFLOPS

Predicted ExaBYTESExaFLOPSLong-Term

(Data courtesy of M. Lautenschlager, DKRZ)

DKRZ Storage and Compute

TB StoredPeak (Tflops)

2000 2004 2008 2012 2016 2020 2024 2028Year

100

101

102

103

104

105

106

107

108

Tera

byte

s Sto

red o

r Peak

Tera

flops

Availa

ble

EXA

PETA

TERA

Predicted ExaFLOPS

Predicted ExaBYTESExaFLOPSLong-Term

(Data courtesy of Mick Carter, UKMO)

UKMO Storage and Compute

TB StoredPeak (Tflops)

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 6 / 23

Page 7: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Heterogeneity in the Workflow Environment

Initial condition

or checkpoint

Final checkpoint

“Traditional” HPC Platform

Earth SystemModel

Simulation(periodic)

outputphysicalvariables

Multiple Tools,

Visualisation

post-processing(& analysis)

Distributed/Federated Archives (Servers/Public Clouds)

Dedicated Analysis Facilities (“Data HPC/Data Cloud”)

Reformatting,Sub-setting,

Downloading,Processing.

Multiple Roles, at least:Model Developer, Model Tinkerer, Expert Data Analyst, Service Provider, Data User

download

Multiple Tools(otherdata)

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 7 / 23

Page 8: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Issues and Actions

Issues

■ Cost: Disk prices not fallingas fast as they used to.

■ Behaviour: Larger groupssharing data for longer ⇒data is re-used for longer.

■ Performance: TraditionalPOSIX file systems notscalable for shared access.

■ Software: Little software forour domain which canexploit object storage anduse the public cloud.

■ Tape: Tape remainsimportant, particularly forlarge amounts of cold data.

ESiWACE Actions

■ Better understanding of costs andperformance of existing andnear-term storage technologies.

■ Earth System Middleware prototype— provides an interface between thecommonly used NetCDF/HDFlibrary and storage which addressesthe performance of POSIX and theusability of object stores (and more).

■ Semantic Storage Library prototype:— Python library that uses a“weather/climate” abstraction(CF-NetCDF data model) to allowone “file” to be stored across tiersof, e.g. POSIX disk, object store,and tape.

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 8 / 23

Page 9: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Issues and Actions

Issues

■ Cost: Disk prices not fallingas fast as they used to.

■ Behaviour: Larger groupssharing data for longer ⇒data is re-used for longer.

■ Performance: TraditionalPOSIX file systems notscalable for shared access.

■ Software: Little software forour domain which canexploit object storage anduse the public cloud.

■ Tape: Tape remainsimportant, particularly forlarge amounts of cold data.

ESiWACE Actions

■ Better understanding of costs andperformance of existing andnear-term storage technologies.

■ Earth System Middleware prototype— provides an interface between thecommonly used NetCDF/HDFlibrary and storage which addressesthe performance of POSIX and theusability of object stores (and more).

■ Semantic Storage Library prototype:— Python library that uses a“weather/climate” abstraction(CF-NetCDF data model) to allowone “file” to be stored across tiersof, e.g. POSIX disk, object store,and tape.

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 8 / 23

Page 10: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Work Package 4 — Exploitability (of data)

Partners

DKRZ, STFC, CMCC, Seagate, UREAD

ECMWF was originally a partner, but we removed the relevant task in the repro-filing following the review

Task 4.1

Cost and Performance

DocumentationFormal deliverableproduced, ongoingwork for publicationand dissemination.

Task 4.2

New Storage Layout

SoftwareESD MiddlewareFormal software de-sign delivered, work onbackends underway.

Task 4.3

New Tape Methods

SoftwareSemantic Storage LibPrototype pieces inplace.

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 9 / 23

Page 11: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Methodology

Simple models

High-level representationof hardware / softwarecomponents.Includes:

■ performance,

■ reslience

■ cost

DeliverableScenarios discussing architecturalchanges for data centres, andimplications for cost/performance

Rack 1

Data center power supplyavail: 99.999%mtbf: 150 days

mttr: 1 min

Rack UPSavail: 99.999999%

+20min

Switchavail: 99.9999%

mttr: 1 day

Rack power supply

Rack switch

Compute nodeStorage server

RAID+: 1+0

HDD1mtbf: 10 y

HDD2mtbf: 10 y

HDD3mtbf: 10 y

HDD4mtbf: 10 y

Rack 1

Switch

Rack switch

20GB/s

Rack switch

20GB/s

Compute node memory: 10x 8 GB DIMMs

cpu: 2 x 10 @ 2Ghz

10GB/s

Compute node memory: 10x 8 GB DIMMs

cpu: 2 x 10 @ 2Ghz

10GB/s

Storage server

15GB/s

RAID 0RAID 1 RAID 1

HDD1capacity: 4TB latency: 5ms

r: 200MB/sw: 100MB/s

HDD2capacity: 4TB latency: 5ms

r: 200MB/sw: 100MB/s

HDD3capacity: 4TB latency: 5ms

r: 200MB/sw: 100MB/s

HDD4capacity: 4TB latency: 5ms

r: 200MB/sw: 100MB/s

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 10 / 23

Page 12: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Performance Modelling

Detailed Modelling

A simulator has beendeveloped, covering:

■ Hardware, software:tape drives, library,cache

■ Can replay recordedFTP traces

■ Validated with DKRZenvironment

UsageAim to use to evaluateperformance and costs offuture storage scenarios.

Client GroupClient

Tape Silo

Shared Cache

Switch

I/O

Servers...

Cache

Switch

Drive

Drive

Drive Drive

Drive Drive

Drive

Drive

Netw

ork

I/O

Sch

ed

ulin

g

Tape Manager

File Manager

Direct RAIT

Cache Policies

Robot

Sched.

Library Topologies

Workload

Generation

Load Balancing

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 11 / 23

Page 13: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Some Results

Costs of storage for DKRZ

■ Tape: 12 e per TB/ year

■ Software licenses for tape aredriving the costs!

■ Parallel Disk: 28(36) e TB/year

■ Object storage: 12.5 e TB/year(without software license costs)

■ Cloud: 48 $ TB/year (onlystorage, access adds costs)

■ Idle (unused) data is animportant cost driver!

Can consider various scenarios

Compute Nodes3.1 PFLOPS/sEUR 15.75M

Memory200TB

Parallel File System52 PB @ 700 GB/s

130 ServersEUR 7.5M

Tape Archive500 PB @ 18 GB/s

EUR 5M

NetworkEUR 5.25M

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 12 / 23

Page 14: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Some Results

Costs of storage for DKRZ

■ Tape: 12 e per TB/ year

■ Software licenses for tape aredriving the costs!

■ Parallel Disk: 28(36) e TB/year

■ Object storage: 12.5 e TB/year(without software license costs)

■ Cloud: 48 $ TB/year (onlystorage, access adds costs)

■ Idle (unused) data is animportant cost driver!

Can consider various scenarios

Compute nodes

Memory

Object store

Network

Compute nodes

Memory NVRAM

Tape archive

Network

Compute nodes

Memory NVRAM

Object store

Network Cloud

Compute nodes

Memory NVRAM

Burst Buffer Tape archive

Network

Compute Nodes

Memory

Object storeTape Archive

Networknetwork

Compute NodesEUR 15.75M3.1 PFLOPS/s

Memory

Parallel File System Tape Archive

NetworkEUR 5.25M

We have yet to work through all ofthese (and others)!

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 12 / 23

Page 15: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

The problem space in more detail

Challenges in the domain of climate/weather

■ Large data volume and high velocity■ Data management practice does not scale & not portable

▶ Cannot easily manage file placement and knowledge ofwhat file contains.

▶ Hierarchical namespaces does not reflect use cases.▶ Bespoke solutions at every site!

■ Suboptimal performance & performance portability▶ Cannot properly exploit the hardware / storage landscape▶ Tuning for file formats and file systems necessary at the

application level

■ Data conversion is often needed▶ To combine data from multiple experiments, time steps, ...

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 13 / 23

Page 16: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

The problem space in more detail

Challenges in the domain of climate/weather

■ Large data volume and high velocity■ Data management practice does not scale & not portable

▶ Cannot easily manage file placement and knowledge ofwhat file contains.

▶ Hierarchical namespaces does not reflect use cases.▶ Bespoke solutions at every site!

■ Suboptimal performance & performance portability▶ Cannot properly exploit the hardware▶ Tuning for file formats and file systems necessary at the

application level

■ Data conversion is often needed▶ To combine data from multiple experiments, time steps, ...

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 13 / 23

Page 17: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Approach

Design Goals of the Earth System Data Middleware

1 Reduce penalties of shared file access2 Ease of use and deployment3 Understand application data structures and scientific metadata4 Flexible mapping of data to multiple storage backends5 Placement based on site-configuration and limited

performance model6 Site-specific (optimized) data layout schemes7 Relaxed access semantics, tailored to scientific data generation8 A configurable namespace based on scientific metadata

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 14 / 23

Page 18: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Approach (continued)

Expected Benefits

■ Expose/access the same data via different APIs■ Independent and lock-free writes from parallel applications■ Storage layout is optimised to local storage

▶ Exploits characteristics of storage, rather than one size streamof bytes fits all.

▶ To achieve portability, we provide commands to createplatform-independent file formats on the site boundary or foruse in the long-term archive (see also the SemSL).

■ Less performance tuning from users needed■ One data structure can be fully or partially replicated with

different layouts to optimize access patterns■ Flexible namespace (similar to MP3 library)

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 15 / 23

Page 19: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Architecture

Key Concepts

■ Applications workthrough existing(NetCDF library)Other interfacescould be supportedin the future.

■ New middlewarebetween HDFlibrary andstorageexposes informationto a "layoutcomponent" aboutthe availablestorage, and data isfragmentedaccordingly.

■ Data is thenwritten efficiently.

NetCDFNetCDF GRIB2

Layout component

User-level APIs

File system Object store ...

User-level APIs

Site-specificback-endsand

mapping

Data-type aware

file a file b file c obj a obj b

Site InternetArchival

CanonicalFormat

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 16 / 23

Page 20: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Architecture

Key Concepts

■ Applications workthrough existing(NetCDF library)Other interfacescould be supportedin the future.

■ New middlewarebetween HDFlibrary andstorageexposes informationto a "layoutcomponent" aboutthe availablestorage, and data isfragmentedaccordingly.

■ Data is thenwritten efficiently.

Tools and services

ESD

Application1

NetCDF4 (patched)

Application2 Application3

GRIB

HDF5 VOL (unmodified)

ESD interface

cp-esd esd-daemonesd-FUSE

Layout Datatypes

ESD (Plugin)

Performance model

Metadata backend Storage backends

Site configuration

RDBMSNoSQL POSIX-IO Object storage Lustre

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 16 / 23

Page 21: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Optimizing for Data Representation

Storage makes placement decisions exploiting the storage landscape

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 17 / 23

Page 22: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Backend Specific Optimization

Interplay of a IO scheduler, a layout component and storagespecific performance models.

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 18 / 23

Page 23: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

First Results with POSIX backend

● ●

●●

● ●●

●●

●●

●●

●●

● ●

●●

●●

●●

●● ●

●●

●●

●●

●●

●●

●●

●●

●●● ●

● ●

●●

●●

●● ●

●●

●●●

●●

●●

● ●

●●●●

●● ●

●●●

●●● ●

●●

● ●●●● ●●●

●●●

●●●

●● ●

●●●

●●●

● ●●●

●●●●

●●●●

●●●●

●●

●●●●

●●●●

●● ●●

●●

●●

● ●

●● ●

●●

●●

●●●

●●

●●

●●

●●

● ●

●●●

● ●●

●●

●● ●

●●

●●

●●

● ●●● ●

● ●●

●●

● ● ●●

● ●● ●●●●

●●●●● ●●

●●

●●● ●

● ● ●● ●●● ●● ●●● ●● ●●●

● ● ● ●●●

● ●●

●●

●●

●●

●●

●●

●● ●●

●●●

●●●●

●●●●

●●●●

●●●●

●●●●

●●●●

●●

● ●

●●●

●●

●●

●●●●

● ●●

● ●

●●

● ●

● ●

●●●

●● ●

●●

●●

●●

● ●●

●●

●●● ●●● ●●● ●●

● ● ●●● ●● ●●● ●

● ●●

●●●

●●●

●● ●●

●●●●

●●●●●

●●

● ●●

●●

● ● ●●

●●● ● ●

●●

●●● ●●● ●●

● ● ●●● ●●

● ●●

●● ●●●●●●

●●●●●●●

●●●

●●●●●●●●

●●

●●

●●

●●

●●●

●●

●●

●●●

●●●

●●

●●

● ●

●●

● ●

●●

●●●● ●●●

●●

● ● ●●

●●

● ●●

●● ●

● ●●

●●●

●●●

●●●●

●●●●

●●●●

●●●●

●●●●

●●

●● ●●● ●● ●●

● ●●●●

● ● ●●●

●●

● ●

●●

●●●●●

●●● ●●●

●●

●●●

● ● ●● ● ●●●

● ●

●●● ●● ● ● ●● ●● ●●

●● ●●

●●

●●●

●●

● ●

● ●●●

● ●

● ●

●●●●

●●●●

●●●●

●●●●

●●●●

●●●●

●● ●● ●●●● ●● ●●●● ●

● ●●●

●●● ●●

●●

●●

●●

●●

●●

●●

●●

●●

● ●

●●

●●

● ●●●

●● ●● ●● ●●

● ●●

● ●●●● ● ●● ●●●● ●●

●●●

●● ●●

● ●●● ●●●●●

● ●

●●●

●●●●●●

●●●●●●

●●●●

●●

●●

●●

●●

●●

●●

● ●

● ●

●●

●●

● ●

● ●

●●

●●●●

● ● ●

● ●

● ●

●● ●

●●

●● ●

● ●● ●● ●

●● ●●

●●

● ●

●●

●●●

●●●●

●●●

●●●●

●●●●

●●● ●●

●●● ●●●

●●●●● ●●●

●●●

●●

●●

●●

●●

●●

●●

●● ●

● ●●●

● ●●

● ●●●●●

● ●●●● ●● ●

●● ●

●● ●●

●●●

●●●●

●●●

●●●●

●●●●

●●●●

●●●

●●●●●●●●

●● ●●● ●●

● ●●

●● ●●● ●●

●● ●●●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

● ●●●● ●●

●●●● ●

● ●●

●●●

● ●●●●

● ●●●● ●●● ● ●●● ●●● ●●

●●●●

●●●●

●●●●●●

●●●

●●●

●●●●

●●●

●●●

●●

●●

●● ●

●●

●●

●●

●●

●●

●●●

●●●

●●

●●

●●

●●●

●●●

●●●●

●●●●

●●●●

●●●●

●●

●●●●

●●●●● ●●

●●

●●

●●

●●

● ●

● ●●●

●●●

●●

●● ●●● ●

●●● ●● ●●●● ●● ●

●●●●

●●

●●●●

●●●

●●●●

●●●●

●●●

●● ●●● ●●● ●●●

●●●●●

●●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●●

●●

● ●●

● ●●

●●

●● ●● ●●

● ●

●●●

●●

●●●●●●

●●●●●●●●

●●●

● ●●

●●

●●●

●●

● ●●

●●

● ●

●●

●●● ●

●●

●●

●●

●●

●●

●● ●

●●

●● ●

●●●●●● ●

● ●

●●●

●●●●

●●●●

●●

●●●●

●●●●

●●

●●

●●

●●● ●

●●

●●

●● ●●

●●

●●

●●

●●

●●

● ●●

●●●

●●

● ●●● ●●● ● ● ●

● ●●●

●● ●

●●●●

●●●

●●●●

●●●

●●●●

●●●

●●

●●

● ●

●●●●●

●●● ●●●

●●

● ●●● ●

● ● ●

●●●

● ●

●●● ●

●●

●●

●●

● ●

●● ●●●

●● ●●

●●

●●

●● ●●

●●●●

●●●

●●● ●

●●

●●●●

●●

Nodes: 1 Nodes: 2 Nodes: 4 Nodes: 8 Nodes: 16

2 MiB

16 MiB

128 MiB

0 10 20 30 0 10 20 30 0 10 20 30 0 10 20 30 0 10 20 30

10

1 k

100 k

10

1 k

100 k

10

1 k

100 k

Nodes + PPN

Trou

ghpu

t MiB

/s

tier●

lustre

shm

ssd

lustre−multifile

adaptive

Each facet shows the measurements for a different number of nodes (columns) and varying checkpoint size (rows).Write

●●

●●

●●

●●

● ●●●

●●

●●

●●●

●●●

●●

●●

● ●●

● ●●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●●

●●

●●●

●●●

●●

●●

●●

●●

●●

●●

●●●●●

●●●●

●●●●

●●

●●●●

●●●●

● ●●

● ●

●●●

●●

●● ●

●●●

●●

●●

●●

●●

●●

● ●

●●

●●

● ●●

● ●

●●●●

● ●●●

●●●

●●

●●

●●

●●

●●

●●

●●●

●● ●●

●●● ●

●●

●●

●●

●●●

●● ●●

●●● ●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●●●●

●●●

●●●●

●●●●

●●●●●●●●

●●

●●

●●●●

●●

●●

●●

●●●

●●

●●

●●

●● ●

●●

●● ●●

● ●

●●●

●●

●●

●●●

●●

●●

●● ●●●

●●

●●●

●●

●●

●●●

●●●

●●

●●●

●●●

●●

●●

●●●

●●

●●

●●

●●●●

●●●

●●●●

●●●

●●●●

●●●●

●● ●●●

●● ●● ●●

●●

●●●

● ●●

●●●

●●●

● ●

● ●

● ●

●●

●●● ●

● ●

●●

● ●

●●

●●

●●

● ●

●●

●●●

●●●●

●●●●

●●●●

●●●●

●●●●

●●

●●

●●●

●● ●●

● ●●●●

●●

● ●

●●

●●●

●●

●●●

●●

●●

● ●

●●

●●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

● ●●●

●●●●

●●●●●●●●

●●●●

●●●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

● ●●

●●●●

● ●

●●

●●

●●●●

●●●●

●●●●●●●●

●●●● ●●●●

●● ● ●●

●● ●

●●●

●● ●

●●

●● ● ●

● ●

●●

●●

●●

●●

●●●

●●

● ●

●●●

●●●

●●●●

●●●

●●●●

●●●●

●●● ●● ●

● ●● ●●

●● ●●●●

●●●

●●

● ●●

●●

●●

●●

●●

●●

● ●●

●●

●●

●●●●

●●

●●

●●

●● ●

●●

●●

●●

●●●

●●

●●●●●

●●●●

●●●●

●●●

●●●●

●●●●

●●● ●

●●● ●

●●

●●●

●●

●●

●●

●●

●●

●●

●●

●●

●●●

●●

●●

●●

●●

●●

●● ●

●●

●●

●●

●●

●●

●●

●●

●●● ●●

●●●

●●●●●●

●●●●

●●●●●●

●●●●●

●●●

●●●●●

● ●●● ●●●

●●

●●

●●

●●● ●

●●

●●

●●

●●

●●

●●●●

●●

●●●●

●●●●

●●●●

●●●●

●● ●●●●● ●●●

●●●● ● ●●

●● ●● ●●

● ●

●●●

●●

●●

●●

●●●

●●

●●

●●●●●

●●●

●●●●

●●●

●●●●

●●●●

●●

● ●● ●●● ●●● ●●●

●● ●●●

●●●

●●

●●

●●

●●●

●●

●●

●● ●

●●

●●●

●●

●●●●●

●●●

●●●● ●●●●

●●●●

●●●

●●

●●

●●●

●●●

●●

●● ●●

●●

●● ●

●●

●●●

●●

●●

●●●

●●

●●

●●●

●●●●

●●

●●●●

●●●●

●●● ●●

●●

●●

● ● ●●●●

●● ●●●

●●

● ●●

●●

●●

●●●

● ●●

●●

●●

●●

●●●●

●●●

●●●●

●●●

●●●●

●●●

●●

●●

●●●●●

●●● ● ●●●

●● ●

●●

●●

●●●

●●

●●

●●

●● ●

●●●

●●

●●

●●

●●●●●●●●

●●●●

●●

●●●● ●●●

Nodes: 1 Nodes: 2 Nodes: 4 Nodes: 8 Nodes: 16

2 MiB

16 MiB

128 MiB

0 10 20 30 0 10 20 30 0 10 20 30 0 10 20 30 0 10 20 30

1 k

100 k

1 k

100 k

1 k

100 k

Nodes + PPN

Trou

ghpu

t MiB

/s

tier●

lustre

shm

ssd

lustre−multifile

adaptive

Each facet shows the measurements for a different number of nodes (columns) and varying checkpoint size (rows).Read

Adaptive Tier Selection for HDF5/NetCDF without requiringchanges to existing applications. (SC17 Research Poster).

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 19 / 23

Page 24: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Other backends – DDN Object Store (CMCC)

WOS Progress

■ First draft of layoutinterfaces.

■ Developed Cwrapper for theC++ DDN WOSlibraries with adirect mapping.

■ Designed a parallelapproach for inde-pendent/multiplewrite operations onWOS storage.

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 20 / 23

Page 25: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Other backends – Seagate CLOVIS

Seagate Progress

■ Data structures andinterfaces designed forESDM to access objectsin Mero with ClovisAPIs.

■ First draft code.

■ Read & writeblock-aligned regionsfrom Mero cluster viaESDM requests, inparallel.

■ In the future: Seagatewill be working onoptimisation,performance andscalability in stablecode.

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 21 / 23

Page 26: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Deployment Testing Example

Test and DeploymentOphidia as a test applicationfor ESDM

■ Import and ExportOphidia operatorsadapted for integrationwith ESDM storage

■ In-memory dataanalysis benchmarkusing ESDM

GRIB support■ Extend Ophidia import/ export operators to provide GRIB support

(implementation expected next year).

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 22 / 23

Page 27: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

ESDM Status & Roadmap

■ Done: ESDM Architecture Design for Prototype■ Done: Proof of concept for adaptive tier selection■ 70%: HDF5 VOL Plugin as Application to ESDM Adapter■ 30%: ESDM Core Implementation as Library■ 20%: Backend Plugins for POSIX, Clovis, WOS■ Q1 2018: Backend for POSIX, Metadata in MongoDB■ Q1 2018: Benchmarking at sites■ Q2 2018: Backends for Clovis, WOS■ Q4 2018: Production version with site-specific mappings

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 23 / 23

Page 28: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

The problem space in more detail

Challenges in the domain of climate/weather

■ Large data volume and high velocity■ Data management practice does not scale & not portable

▶ Cannot easily manage file placement and knowledge ofwhat file contains.

▶ Hierarchical namespaces does not reflect use cases.▶ Bespoke solutions at every site!

■ Suboptimal performance & performance portability▶ Cannot properly exploit the hardware▶ Tuning for file formats and file systems necessary at the

application level

■ Data conversion is often needed▶ To combine data from multiple experiments, time steps, ...

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 24 / 23

Page 29: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Approach

Design Goals of the Semantic Storage Library

1 Provide a portable library to address user management of datafiles on disk and tape which▶ does not require significant sysadmin interaction, but▶ can make use of local customisation if available/possible.

2 Increase bandwidth to/from tape by exploiting RAID-to-TAPE.3 Exploit current and likely storage architectures (tape, disk

caches, POSIX and object stores).4 Can be deployed in prototype fast enough that we can use it

for the Exascale Demonstrator.5 Exploit existing metadata conventions.6 Can eventually be backported to work with the ESDM.

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 25 / 23

Page 30: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Architecture

CFA Framework (https://goo.gl/DdxGtw)

1 Based on CF Aggregation framework proposed 6 years agohttps://goo.gl/K8jCP8.

2 Define how multiple CF fields may be combined into one largerfield (or how one large field can be divided).

3 Fully general and based purely on CF metadata.4 Includes a syntax for storing an aggregation in a NetCDF file

using JSON string content to point at aggregated files.

Two Key Components

1 S3NetCDF — a drop in replacement for NetCDF4-python.2 CacheFace - a portable drop-in cache with support for object

stores and tape systems.

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 26 / 23

Page 31: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Architecture

CFA Framework (https://goo.gl/DdxGtw)

1 Based on CF Aggregation framework proposed 6 years agohttps://goo.gl/K8jCP8.

2 Define how multiple CF fields may be combined into one largerfield (or how one large field can be divided).

3 Fully general and based purely on CF metadata.4 Includes a syntax for storing an aggregation in a NetCDF file

using JSON string content to point at aggregated files.

Two Key Components

1 S3NetCDF — a drop in replacement for NetCDF4-python.2 CacheFace - a portable drop-in cache with support for object

stores and tape systems.

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 26 / 23

Page 32: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

S3NetCDF (working title)

Object

Store

Each object

is a valid

NetCDF File

AppApp

S3netCDFS3netCDF

S3

API

Local

cache

Local

cache

File split following CFA conventions

CFA-netCDF subarray

CFA-netCDF subarray

CFA-netCDF subarray

CFA-netCDF master array

■ Master Array File is a NetCDF file containing dimensions and metadata for thevariables (including URLs to fragment file locations).

■ Master Array File can be in persistent memory or online, nearline, etc

■ NetCDF tools can query file CF metadata content without fetching them.

■ Currently serial, work on parallelisation underway.

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 27 / 23

Page 33: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

CacheFace (working title)

POSIX

FS

Cache

POSIX

FS

Cache

Object StorePOSIX FS

Client

Object

Object

Object

Metadata

Search

PUT

GETTiered

Storage

Client

TapeSystem

R/W

G/P

PluginInterfaceto tape

CacheFace StatusPrototype pieces exit

■ Simple metadatasystem designed.

■ Cache system designedand prototype built thatcan use Minio interfaceto object store.

■ Another cache systembuilt which depends onour bespoke tapeenvironment(ElasticTape).

■ Work planned onintegration anddeveloping pluginconcept.

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 28 / 23

Page 34: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Dissemination and Publications

■ SC16 Research Poster▶ Modeling and Simulation of Tape Libraries for Hierarchical

Storage Systems

■ PDSW-DISCS Workshop at SC16 WiP▶ Middleware for Earth System Data

■ HPC-IODC Workshop at ISC17 Paper▶ Simulation of Hierarchical Storage Systems for TCO and QoS

■ ISC17 Project Poster▶ Middleware for Earth System Data

■ PDSW-DISCS Workshop at SC17 WiP▶ Towards Structure-Aware Earth System Data Management

■ SC17 Research Poster▶ Adaptive Tier Selection for NetCDF/HDF5

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 29 / 23

Page 35: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

HPC-IODC Workshop at ISC17

Simulation of tape archives to improve hierarchical storage systemsand test novel integration of cold storage in data centers.

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 30 / 23

Page 36: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

PDSW-DISCS Workshop at SC17

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 31 / 23

Page 37: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

SC17 Research Poster

Proof of concept and early work on site characterization.

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 32 / 23

Page 38: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

The landscape is (rapidly) changing

EnvironmentScheduler support for NonVolatile Memory. Differentmodes of use. (NEXTGENIO)

Dynamic on the fly file systems(BeeOND, ADA FS, evenCephFS) . . .

NetCDFOngoing proposals to addressthread safety etc - NetCDF willevolve.

Considering NCX format foroptimising READ-only access.Not HDF (but alongside HDF).

HDFH5Serv and HSDS/HDF-Cloud

Serving files via REST (storagecan be files or fragments), APIunchanged.(We are experimenting with thisin ESIWACE1 WP4)

ExaHDFMultiple formats under HDF5AP (ADIOS, NC3 etc)

Climate use case. Better HPCperformance, asyncio, datamodel support for cloudresolving model grids . . .

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 33 / 23

Page 39: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

Introduction Team and Tasks T1: Costs T2: ESDM T3: SemSL Dissemination Next Steps

Supporting the EU Exascale Vision and Beyond

To demonstrate benefits, we need better integration with WPs

Short Term Goals

■ "ESiWACE1 Hero Runs"should send (some) datato JASMIN for completeworkflow proof of principle.

■ Data should then befragmented using theSemSL so that some datais on tape and some ondisk, and users can controlwhat is where.

Medium Term Goals

■ Utilise the ESDM inside a largemodel run (WP2)

■ Consider how to connect ESDMoutput with WAN transfer andSemSL in workflow. (WP3)

■ Consider ESDM integration withother on the fly file systems,ExaHDF etc (we can do that)

■ How will we work with the ESDsites?

■ We will need to work out how to establish appropriate internalliaisons to make most of those things happen.

■ We would like to have some active discussion about how totake this forward !

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 34 / 23

Page 40: Exploiting Weather & Climate Data at Scale (WP4) · WP3 Usability WP4 Exploitability The business of storing and exploiting high volume data New storage layout for Earth system data

The ESiWACE project has received funding from the EuropeanUnion’s Horizon 2020 research and innovation programme undergrant agreement No 675191

Disclaimer: This material reflects only the author’s view and the EU-Commission is

not responsible for any use that may be made of the information it contains

Kunkel & Lawrence (WP4 Team) Status of WP4: Exploitability ESiWACE GA, Dec 2017 35 / 23