Data center requirements and services

Stephan Kindermann, Michael Lautenschlager, Katharina Berger, Tobias Weigel, Hans Dieter Hollweg
Deutsches Klimarechenzentrum (DKRZ)

Post on 19-Oct-2021


Page 1: Data center requirements and services

S. Kindermann (DKRZ)

Stephan Kindermann, Michael Lautenschlager, Katharina Berger, Tobias Weigel, Hans Dieter Hollweg

Deutsches Klimarechenzentrum (DKRZ)

Data center requirements and services

Page 2: Data center requirements and services

Overview

§  Update: the new data infrastructure hosting environment at DKRZ
§  ESGF: DKRZ data life cycle services
§  LTA/WDCC–ESGF integration
§  Quality assurance
§  Data near processing
§  Towards PID based services
§  CMIP6 at DKRZ

12/8/15

Page 3: Data center requirements and services

DKRZ data center update

Migration to new integrated HPC/data system
•  separate DTNs (starting 2016)
•  establishment of a „national MIP data analysis cache“
•  data cloud to support data ingest process

„national MIP data cache“
•  management etc. tbd

[Diagram: DKRZ data infrastructure in three phases: until end 2015, from mid 2015, from 2016. Labels as extracted:]

(pre-shutdown) ESGF infrastructure; 4 data nodes; Index node; No separate DTNs

CERA LTA infrastructure; 1 CERA/ESGF data node; GPFS; HPSS; Oracle DB cluster; CERA portal

Mistral; CERA/ESGF portal; 2 data nodes; LTA (Oracle); LTA data node; 2 DTNs; LUSTRE; ro NFS

DFN: 2x3..5GB; HH: 2x10GB

HPC + Interactive nodes + visualization nodes; Openstack cloud; all: behind firewall; VMs (XEN)

Page 4: Data center requirements and services

DKRZ long term archival and data citation

Major use case
§  Replication
§  Support data evaluation
§  Quality Assurance
§  Long Term Archival
§  DOI assignment
§  Exposure as ESGF data node

[Diagram labels: container cache; container server; CERA (Oracle); LTA (HPSS); CERA Portal / DDC ..; ESGF; QA; DOI; Process; replication, versioning; National climate data node (MIP cache); Data near processing; ESGF shutdown; ESGF data node; COG portal; WPS; ingest]

Page 5: Data center requirements and services

WDCC/CERA/HPSS ↔ ESGF integration

Improved system for CMIP6:
•  FUSE based mounting of DKRZ HPSS/cache legacy system
•  Extraction of CERA metadata for ESGF mapfile
•  „standard“ ESGF publication in an „offline mode“

Operational for CMIP5
•  CERA metadata (Oracle) → ESGF index
•  Thredds server with ESGF security filter + HPSS data container server → ESGF data node

→ Future COG portal visibility of (non CMIP) WDCC LTA project data

[Diagram labels: container cache; container server; CERA (Oracle); LTA (HPSS); FUSE; Mapfile generation; COG portal; LTA ESGF data node; ESGF Solr index; ESGF Index node; ESGF Publisher; Postgres/THREDDS]
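
The mapfile generation step on this slide can be illustrated with a minimal sketch. It assumes the archive is visible as an ordinary directory tree (for example through the FUSE mount) and that the dataset identifier and version have already been extracted from the CERA metadata; both are passed in by the caller here. The pipe-separated field layout follows the form commonly used for ESGF publisher mapfiles, but the exact set of fields (version suffix, checksum type) is an assumption and should be checked against the publisher version in use. All function names are hypothetical.

```python
import hashlib
import os
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mapfile_line(dataset_id, version, path):
    """Build one pipe-separated mapfile entry for a single file.

    The field layout is an assumption, not taken from the slides.
    """
    stat = os.stat(path)
    return " | ".join([
        f"{dataset_id}#{version}",
        str(path),
        str(stat.st_size),
        f"mod_time={stat.st_mtime:.6f}",
        f"checksum={sha256_of(path)}",
        "checksum_type=SHA256",
    ])


def build_mapfile(dataset_id, version, directory):
    """Return mapfile entries for all *.nc files below a directory."""
    files = sorted(Path(directory).rglob("*.nc"))
    return [mapfile_line(dataset_id, version, f) for f in files]
```

Reading every file to compute a checksum is expensive on a tape-backed archive, so a real workflow would more likely reuse checksums already stored with the archive metadata.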

Page 6: Data center requirements and services

(CMIP data) Quality Assurance Software

[Diagram labels: mainFile; NetCDF File; CF Conventions Tables; Project Configuration & Tables; User-modified Directives; NC-API; M-D Store; CF Conv. Checks; Annotations; QA; Time; Data; Consistency between sub-temporal files; DRS CV; Variable Requirements (CMOR); Project Rules]

CF Conventions Check
•  Versions: 1.4-1.6
•  8-9 Chapters of rules
•  table based config (area-type, cf-standard-name, stand-region-name, ..)

•  Source code: https://github.com/h-dh/QA-DKRZ
•  Pre-packaged versions: conda based, docker based
•  Documentation: http://qa-dkrz.readthedocs.org/en/latest/qa-user-manual.html

Completely re-structured and modularized:
•  Flexible configuration
•  Used heavily for CORDEX – will support CMIP6
•  Separate cf-checker module
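
One of the checks named on this slide, consistency between sub-temporal files, can be illustrated with a small sketch. This is not the QA-DKRZ implementation: it only inspects file names, and it assumes that each file name ends in a `_YYYYMM-YYYYMM.nc` period as in CMIP-style monthly data. The function names and the example file names are hypothetical.

```python
import re

# Assumed file name ending: _YYYYMM-YYYYMM.nc
PERIOD = re.compile(r"_(\d{6})-(\d{6})\.nc$")


def month_index(yyyymm):
    """Map 'YYYYMM' to a running month count so periods can be compared."""
    return int(yyyymm[:4]) * 12 + int(yyyymm[4:]) - 1


def check_contiguity(filenames):
    """Return a list of messages for gaps or overlaps between file periods."""
    periods = []
    issues = []
    for name in filenames:
        match = PERIOD.search(name)
        if match is None:
            issues.append(f"{name}: no YYYYMM-YYYYMM period in file name")
            continue
        start, end = (month_index(group) for group in match.groups())
        if end < start:
            issues.append(f"{name}: period ends before it starts")
            continue
        periods.append((start, end, name))
    periods.sort()
    # Compare each file with its successor in time order.
    for (_, prev_end, prev_name), (start, _, name) in zip(periods, periods[1:]):
        if start > prev_end + 1:
            issues.append(f"gap between {prev_name} and {name}")
        elif start <= prev_end:
            issues.append(f"overlap between {prev_name} and {name}")
    return issues
```

A fuller check would read the time axis from the NetCDF files themselves, since file names and file contents can disagree.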

Page 7: Data center requirements and services

National MIP data analysis cache/node

„Ad hoc“ approach → transparent solution:
§  Data needed → help desk → data manager
§  RO mounted on HPC data analysis nodes
§  Support for data analysis VM deployment
§  Support for tool dependency management (install recipes, conda, docker)
§  WPS framework to support web service deployments
§  Birdhouse (https://github.com/bird-house)
§  conda/docker support
§  Support for home institution (test-) deployments

[Diagram labels: ESGF; replication, versioning; National climate data node (MIP cache); Data near processing; WPS; ingest]
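
To make the WPS part of this slide concrete, the sketch below builds key-value request URLs in the style of the OGC WPS 1.0.0 interface (GetCapabilities, DescribeProcess, Execute). The endpoint URL and the process identifier in the usage example are placeholders, not actual DKRZ or Birdhouse services, and `wps_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def wps_url(endpoint, request, identifier=None, inputs=None):
    """Build a WPS 1.0.0 key-value request URL.

    endpoint   -- base URL of the WPS service (placeholder in this sketch)
    request    -- 'GetCapabilities', 'DescribeProcess' or 'Execute'
    identifier -- process identifier, needed for the latter two requests
    inputs     -- dict of literal inputs, encoded as 'key=value;key=value'
    """
    params = {"service": "WPS", "version": "1.0.0", "request": request}
    if identifier is not None:
        params["identifier"] = identifier
    if inputs:
        params["DataInputs"] = ";".join(f"{k}={v}" for k, v in inputs.items())
    # urlencode percent-encodes the ';' and '=' inside DataInputs.
    return f"{endpoint}?{urlencode(params)}"
```

For example, `wps_url("http://localhost:8094/wps", "Execute", "hello", {"name": "DKRZ"})` yields an Execute request for a hypothetical process named `hello` on a local test deployment.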

Page 8: Data center requirements and services

Stable file/collection management!?

[Diagram labels: container cache; container server; CERA (Oracle); LTA (HPSS); CERA Portal / ..; ESGF; QA; DOI; Process; replication, versioning; National climate data node (MIP cache); Data near processing; ESGF data node; COG portal; WPS; ingest]

Page 9: Data center requirements and services

Towards PID based services

Motivation: Stable ESGF data space based on PID infrastructure

Collaborations:
•  ePIC: DKRZ partner → prefix registration
•  EUDAT: DKRZ leads PID task → API
•  RDA: DKRZ co-chairs PIT and collections WGs
•  Envri+: PIDs in environmental sciences

Next ESGF steps:
•  Test-Environment (PID system + publisher)
•  Scalable, stable PID assignment:
•  CMOR integration, CDNOT involvement
•  PID API / ESGF publisher integration
•  High available message queuing system integration
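
The idea of assigning PIDs to files and collections and passing them on through a message queue can be sketched as follows. This is only an illustration under stated assumptions: PIDs are formed in the Handle style `prefix/suffix` with a random UUID as suffix, the prefix used in the example is a placeholder rather than a registered one, and the record fields and function names are hypothetical. No real PID service or queue is contacted.

```python
import json
import uuid


def mint_pid(prefix):
    """Create a Handle-style identifier 'prefix/suffix' with a random UUID suffix."""
    return f"{prefix}/{uuid.uuid4()}"


def file_record(prefix, file_name, checksum):
    """Describe one file together with its newly minted PID."""
    return {"pid": mint_pid(prefix), "file_name": file_name, "checksum": checksum}


def collection_record(prefix, dataset_id, version, files):
    """Describe a dataset version as a collection of file PIDs."""
    return {
        "pid": mint_pid(prefix),
        "dataset_id": dataset_id,
        "version": version,
        "members": [entry["pid"] for entry in files],
    }


def to_message(record):
    """Serialize a record as JSON, e.g. as the payload of a queue message."""
    return json.dumps(record, sort_keys=True)
```

A publisher integration could create one file record per published file, one collection record per dataset version, and hand the serialized messages to the queuing system, which then registers the PIDs asynchronously.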

Page 10: Data center requirements and services

Summary

Long term archival use case
→ ESGF integration
→ Quality Assurance
→ PID assignment early in data life cycle
→ early citation and DOI assignment
→ future PID based data management services
→ future PID based end user services
→ future PID based provenance support

Page 11: Data center requirements and services

Thank you

Page 12: Data center requirements and services

DKRZ services

New developments
§  New integrated HPC/Data System installed in 2015, ~50 PByte Lustre
§  Storage cloud (openstack)
§  Community data analysis cache and platform

ESGF:
§  WDCC/HPSS/ESGF data node
§  WPS compute platform birdhouse
§  Towards PID / early citation services

[Diagram label: data ingest]

Page 13: Data center requirements and services

(Early) Data Citation (DM + ESGF)
§  Impact on CMIP6 data management (DM) and ESGF governance (ESGF)
§  Request from modelling groups for a data citation reference just after ESGF data publication
§  CMIP6 data publication workflow:

iCAS2015

CMIP6 citation granularities are collection levels:
§  Simulation
§  Model