data center requirements and services
TRANSCRIPT
S. Kindermann (DKRZ)
Stephan Kindermann, Michael Lautenschlager, Katharina Berger, Tobias Weigel, Hans Dieter Hollweg
Deutsches Klimarechenzentrum (DKRZ)
DKRZ
Data center requirements and services
Overview
§ Update: the new data infrastructure hosting environment at DKRZ
§ ESGF: DKRZ data life cycle services
  § LTA/WDCC–ESGF integration
  § Quality assurance
  § Data near processing
  § Towards PID based services
§ CMIP6 at DKRZ
12/8/15
DKRZ data center update
(pre-shutdown) ESGF infrastructure
4 data nodes
Index node
No separate DTNs
CERA LTA infrastructure
1 CERA/ESGF data node
GPFS
HPSS
Oracle DB cluster
CERA portal
Mistral
CERA/ESGF portal
2 data nodes
LTA (Oracle)
LTA data node
2 DTNs
LUSTRE
ro NFS
DFN: 2x3..5GB, HH: 2x10GB
HPC + interactive nodes + visualization nodes
Openstack cloud
all: behind firewall
VMs (XEN)
„national MIP data cache“
• management etc. tbd
Migration to new integrated HPC/data system
• separate DTNs (starting 2016)
• establishment of a „national MIP data analysis cache“
• data cloud to support data ingest process
from mid 2015, from 2016, until end 2015
DKRZ long term archival and data citation
Major use case
§ Replication
§ Support data evaluation
§ Quality Assurance
§ Long Term Archival
§ DOI assignment
§ Exposure as ESGF data node
container cache
container server
CERA (Oracle)
LTA (HPSS)
CERA Portal / DDC ..
ESGF QA DOI
Process
replication, versioning
National climate data node (MIP cache)
Data near processing
ESGF shutdown
ESGF data node, COG portal, WPS
ingest
WDCC/CERA/HPSS ↔ ESGF integration
Improved system for CMIP6:
• FUSE based mounting of DKRZ HPSS/cache legacy system
• Extraction of CERA metadata for ESGF mapfile
• „standard“ ESGF publication in an „offline mode“
Operational for CMIP5
• CERA metadata (Oracle) → ESGF index
• Thredds server with ESGF security filter + HPSS data container server → ESGF data node
→ Future COG portal visibility of (non CMIP) WDCC LTA project data
container cache
container server
CERA (Oracle)
LTA (HPSS)
FUSE
Mapfile generation
COG portal
LTA ESGF data node
ESGF Solr index
ESGF Index node
ESGF Publisher
Postgres/THREDDS
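The mapfile generation step can be illustrated with a minimal Python sketch. It assumes the archived files are already readable through a mount point (as the FUSE approach suggests) and that the target is the pipe-separated mapfile layout used by the ESGF publisher (dataset id with version, path, size, then key=value pairs). The function name, the dataset id, and the version string are hypothetical, and the lookup of CERA metadata in Oracle is not shown.

```python
import hashlib
from pathlib import Path


def mapfile_line(dataset_id: str, version: str, path: Path) -> str:
    """Build one mapfile entry for a file that is readable through the mount.

    dataset_id and version are assumed to come from the archive metadata;
    how they are extracted is outside this sketch.
    """
    # Reading the whole file is acceptable for a sketch; large files
    # should be hashed in chunks instead.
    checksum = hashlib.sha256(path.read_bytes()).hexdigest()
    stat = path.stat()
    fields = [
        f"{dataset_id}#{version}",
        str(path),
        str(stat.st_size),
        f"mod_time={stat.st_mtime}",
        f"checksum={checksum}",
        "checksum_type=SHA256",
    ]
    return " | ".join(fields)
```

One line is produced per file; writing the lines for all files of a dataset into a single text file gives a mapfile that can be handed to the publication step.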
(CMIP data) Quality Assurance Software
mainFile
NetCDF File
CF Conventions Tables
Project Configuration & Tables
User-modified Directives
NC-APIM-DStore
CF Conv. Checks
Annotations
QA
Time
Data
Consistency between sub-temporal files
DRS CV
Variable Requirements (CMOR)
Project Rules
CF Conventions Check
• Versions: 1.4-1.6
• 8-9 chapters of rules
• table based config (area-type, cf-standard-name, stand-region-name, ..)
• Source code: https://github.com/h-dh/QA-DKRZ
• Pre-packaged versions: conda based, docker based
• Documentation: http://qa-dkrz.readthedocs.org/en/latest/qa-user-manual.html
Completely re-structured and modularized:
• Flexible configuration
• Used heavily for CORDEX – will support CMIP6
• Separate cf-checker module
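The "consistency between sub-temporal files" check can be illustrated with a small Python sketch. It is not taken from QA-DKRZ; it only shows the idea for monthly data, under the assumption that each file name ends in a `_YYYYMM-YYYYMM.nc` time range. The function names and the file naming pattern are illustrative assumptions.

```python
import re

# Assumed naming pattern: the file name ends in _YYYYMM-YYYYMM.nc
RANGE = re.compile(r"_(\d{6})-(\d{6})\.nc$")


def month_index(yyyymm: str) -> int:
    """Map YYYYMM to a running month count so ranges can be compared."""
    return int(yyyymm[:4]) * 12 + int(yyyymm[4:]) - 1


def find_gaps_and_overlaps(filenames):
    """Return (kind, earlier_file, later_file) for each inconsistent pair."""
    ranges = []
    for name in filenames:
        match = RANGE.search(name)
        if match is None:
            raise ValueError(f"no time range found in {name!r}")
        start, end = month_index(match.group(1)), month_index(match.group(2))
        ranges.append((start, end, name))
    ranges.sort()

    problems = []
    for (_, end0, name0), (start1, _, name1) in zip(ranges, ranges[1:]):
        if start1 > end0 + 1:
            problems.append(("gap", name0, name1))
        elif start1 <= end0:
            problems.append(("overlap", name0, name1))
    return problems
```

An empty result means the files of one variable tile the time axis without gaps or overlaps. A real check would also compare the time values stored inside the files, not only the names.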
National MIP data analysis cache / node
„Ad hoc“ approach → transparent solution:
§ Data needed → helpdesk → data manager
§ RO mounted on HPC data analysis nodes
§ Support for data analysis VM deployment
§ Support for tool dependency management (install recipes, conda, docker)
§ WPS framework to support web service deployments
  § Birdhouse (https://github.com/bird-house)
  § conda/docker support
  § Support for home institution (test-) deployments
ESGF
replication, versioning
National climate data node (MIP cache)
Data near processing
WPS
ingest
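A web service deployed through a WPS framework is addressed with the standard OGC WPS key-value requests. The Python sketch below builds such request URLs for WPS 1.0.0. The endpoint URL and the process identifier are hypothetical placeholders, not addresses of an actual DKRZ or Birdhouse deployment.

```python
from urllib.parse import urlencode


def wps_url(endpoint: str, request: str, **extra: str) -> str:
    """Build a WPS 1.0.0 key-value request URL for the given endpoint."""
    params = {"service": "WPS", "version": "1.0.0", "request": request}
    params.update(extra)
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical local test deployment, for illustration only.
ENDPOINT = "http://localhost:5000/wps"

capabilities = wps_url(ENDPOINT, "GetCapabilities")
describe = wps_url(ENDPOINT, "DescribeProcess", identifier="example_process")
```

`GetCapabilities` lists the processes a deployment offers and `DescribeProcess` returns the inputs and outputs of one of them, which is usually enough to verify a test deployment at a home institution.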
Stable file/collection management!?
container cache
container server
CERA (Oracle)
LTA (HPSS)
CERA Portal / ..
ESGF QA DOI
Process
replication, versioning
National climate data node (MIP cache)
Data near processing
ESGF data node, COG portal, WPS
ingest
Towards PID based services
Motivation: Stable ESGF data space based on PID infrastructure
Collaborations:
• ePIC: DKRZ partner → prefix registration
• EUDAT: DKRZ leads PID task → API
• RDA: DKRZ co-chairs PIT and collections WGs
• Envri+: PIDs in environmental sciences
Next ESGF steps:
• Test environment (PID system + publisher)
• Scalable, stable PID assignment:
  • CMOR integration, CDNOT involvement
  • PID API / ESGF publisher integration
  • Highly available message queuing system integration
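How PID assignment at publication time might hand work to a message queue can be sketched in Python as follows. This is only an illustration under stated assumptions: the handle prefix `21.TEST`, the message field names, and the use of UUID suffixes are hypothetical, and sending the message to a queuing system is left out.

```python
import json
import uuid


def make_pid(prefix: str) -> str:
    """Create a handle-style identifier of the form hdl:<prefix>/<suffix>."""
    return f"hdl:{prefix}/{uuid.uuid4()}"


def pid_message(prefix: str, dataset_id: str, file_names: list) -> str:
    """Build a JSON message describing a dataset PID and its file PIDs.

    The message layout is a hypothetical example; a publisher would send
    it to a message queue, from which a separate service registers the PIDs.
    """
    dataset_pid = make_pid(prefix)
    files = [
        {"pid": make_pid(prefix), "file_name": name, "part_of": dataset_pid}
        for name in file_names
    ]
    message = {"pid": dataset_pid, "dataset_id": dataset_id, "files": files}
    return json.dumps(message, sort_keys=True)
```

Decoupling PID creation from registration in this way lets publication proceed even when the registration service is briefly unavailable, which is one reason to want a highly available queue.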
Summary
Long term archival use case
→ ESGF integration
→ Quality Assurance
→ PID assignment early in data lifecycle
→ early citation and DOI assignment
→ future PID based data management services
→ future PID based end user services
→ future PID based provenance support
Thank You
DKRZ services
New developments
§ New integrated HPC/Data System installed in 2015, ~50 PByte Lustre
§ Storage cloud (openstack)
§ Community data analysis cache and platform
ESGF:
§ WDCC/HPSS/ESGF data node
§ WPS compute platform birdhouse
§ Towards PID / early citation services
data ingest
(Early) Data Citation (DM+ESGF)
§ Impact on CMIP6 data management (DM) and ESGF governance (ESGF)
§ Request from modelling groups for a data citation reference just after ESGF data publication
§ CMIP6 data publication workflow:
iCAS2015
CMIP6 citation granularities are collection levels:
§ Simulation
§ Model