Status of HPC infrastructure and NWP operation in JMA
Toshiharu Tauchi
Numerical Prediction Division, Japan Meteorological Agency
16th Workshop on High Performance Computing in Meteorology



Contents
• Current HPC system and Operational suite
• Updates of Operational Models
• Research for future NWP systems


CURRENT HPC SYSTEM AND OPERATIONAL SUITE


HPC Growth at JMA


9th generation HPC system
• HITACHI SR16000 model M1
  – Two independent systems.
    • Primary: Operational NWP
    • Secondary: Backup and Development
  – Specifications

Node
  – CPU: IBM POWER7 3.83 GHz x 4
  – # of cores: 8
  – Peak Performance: 980.48 GFlops
  – Main Memory: 128 GB

Total
  – Num. of Nodes: 864 (432 x 2)
  – Peak Performance: 847 (423.5 x 2) TFlops
  – Main Memory: 108 (54 x 2) TB

Operating system: AIX 7.1
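The node and total figures above can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check under stated assumptions: that "8 cores" is a per-CPU count, that each core performs 8 double-precision floating-point operations per cycle, and that 1 TB is counted as 1024 GB. None of these is stated on the slide.

```python
# Cross-check of the SR16000 M1 figures quoted on the slide.
# Assumptions (not stated on the slide): "8 cores" is per CPU, each core
# performs 8 double-precision flops per cycle, and 1 TB = 1024 GB.
CPUS_PER_NODE = 4
CORES_PER_CPU = 8          # assumed to be a per-CPU count
CLOCK_GHZ = 3.83
FLOPS_PER_CYCLE = 8        # assumed
NODES = 864
MEM_PER_NODE_GB = 128


def node_peak_gflops():
    """Peak performance of one node in GFlops."""
    return CPUS_PER_NODE * CORES_PER_CPU * CLOCK_GHZ * FLOPS_PER_CYCLE


def total_peak_tflops(nodes=NODES):
    """Peak performance of `nodes` nodes in TFlops."""
    return nodes * node_peak_gflops() / 1000.0


def total_memory_tb(nodes=NODES):
    """Main memory of `nodes` nodes in TB (1 TB = 1024 GB)."""
    return nodes * MEM_PER_NODE_GB / 1024.0
```

Under these assumptions the per-node peak comes out at 980.48 GFlops and the two systems together at about 847 TFlops and 108 TB, in line with the listed values.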


Computer System


24 km west of the JMA headquarters (Tokyo)


Building
[Photos: notice board on wall; on the ground; quake-absorbing structure; vibration control damper]


CURRENT HPC SYSTEM AND OPERATIONAL SUITE


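Current NWP models at NPD/JMA

Global Spectral Model (GSM)
  – Objectives: Short- and medium-range forecast
  – Forecast domain: Global
  – Horizontal resolution: TL959 (0.1875 deg)
  – Vertical levels / Model top: 100 / 0.01 hPa
  – Forecast periods (initial time): 84 hours (00, 06, 18 UTC); 264 hours (12 UTC)
  – Initial condition: Global Analysis (4D-Var)

Meso-Scale Model (MSM)
  – Objectives: Disaster reduction, aviation forecast
  – Forecast domain: Japan and its surroundings (4080km x 3300km)
  – Horizontal resolution: 5km
  – Vertical levels / Model top: 50 / 21.8km
  – Forecast periods (initial time): 39 hours (00, 03, 06, 09, 12, 15, 18, 21 UTC)
  – Initial condition: Meso-scale Analysis (4D-Var)

Local Forecast Model (LFM)
  – Objectives: Aviation forecast, disaster reduction
  – Forecast domain: Japan and its surroundings (3160km x 2600km)
  – Horizontal resolution: 2km
  – Vertical levels / Model top: 60 / 20.2km
  – Forecast periods (initial time): 9 hours (00-23 UTC hourly)
  – Initial condition: Local Analysis (3D-Var)

One-week Ensemble (WEPS)
  – Objectives: One-week forecast
  – Forecast domain: Global
  – Horizontal resolution: TL479 (0.375 deg)
  – Vertical levels / Model top: 60 / 0.1 hPa
  – Forecast periods (initial time): 264 hours (00, 12 UTC), 27 members
  – Initial condition: Global Analysis with ensemble perturbations (SV)

Typhoon Ensemble (TEPS)
  – Objectives: Typhoon forecast
  – Forecast domain: Global
  – Horizontal resolution: TL479 (0.375 deg)
  – Vertical levels / Model top: 60 / 0.1 hPa
  – Forecast periods (initial time): 132 hours (00, 06, 12, 18 UTC), 25 members
  – Initial condition: Global Analysis with ensemble perturbations (SV)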


Red: Updates after installation of 9th generation HPC
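The pairs of truncation and grid spacing quoted for the global models (TL959 with 0.1875 deg, TL479 with 0.375 deg) are consistent with the usual linear-grid relation between truncation wavenumber and number of longitudes. The sketch below assumes a linear grid with 2(N+1) longitudes and a mean Earth radius of 6371 km; both are assumptions for illustration, not values taken from the slides.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def linear_grid_longitudes(truncation):
    """Number of longitudes, assuming a linear grid with 2 * (N + 1) points."""
    return 2 * (truncation + 1)


def grid_spacing_deg(truncation):
    """Longitudinal grid spacing in degrees."""
    return 360.0 / linear_grid_longitudes(truncation)


def grid_spacing_km_at_equator(truncation):
    """Longitudinal grid spacing in km along the equator."""
    circumference = 2.0 * math.pi * EARTH_RADIUS_KM
    return circumference / linear_grid_longitudes(truncation)
```

For example, `grid_spacing_deg(959)` gives 0.1875 and `grid_spacing_km_at_equator(959)` gives a little under 21 km.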


Current DA systems at NPD/JMA

Global Analysis (GA)
  – Analysis scheme: 4D-Var
  – Analysis time: 00, 06, 12, 18 UTC
  – Data cut-off time: 2 hours 20 minutes [Early Analysis]; 11 hours 50 minutes (00, 12 UTC) and 7 hours 50 minutes (06, 18 UTC) [Cycle Analysis]
  – Analysis domain: Global
  – Horizontal resolution (inner-model resolution): TL959 / 0.1875 deg (TL319 / 0.5625 deg)
  – Vertical levels / Model top: 100 levels up to 0.01 hPa
  – Assimilation window: -3 hours to +3 hours of analysis time

Meso-scale Analysis (MA)
  – Analysis scheme: 4D-Var
  – Analysis time: 00, 03, 06, 09, 12, 15, 18, 21 UTC
  – Data cut-off time: 50 minutes
  – Analysis domain: Japan and its surroundings (4080km x 3300km)
  – Horizontal resolution (inner-model resolution): 5 km (15 km)
  – Vertical levels / Model top: 50 levels up to 21.8km
  – Assimilation window: -3 hours to analysis time

Local Analysis (LA)
  – Analysis scheme: 3D-Var
  – Analysis time: hourly
  – Data cut-off time: 30 minutes
  – Analysis domain: Japan and its surroundings (3160km x 2600km)
  – Horizontal resolution: 5km
  – Vertical levels / Model top: 50 levels up to 21.8km
  – Assimilation window: -

Red: Updates after installation of 9th generation HPC
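To make the cut-off times and assimilation windows in the DA table concrete, the sketch below turns them into clock times for a given analysis time. It reads the cut-off as the delay after the analysis time before an analysis can start, which is the usual meaning but is an interpretation here. The dictionary layout and function names are hypothetical.

```python
from datetime import datetime, timedelta

# Values copied from the DA table. The dictionary layout and the function
# names are illustrative assumptions, not JMA's job definitions.
DA_SYSTEMS = {
    "GA_early": {
        "cutoff": timedelta(hours=2, minutes=20),
        "window": (timedelta(hours=-3), timedelta(hours=3)),
    },
    "MA": {
        "cutoff": timedelta(minutes=50),
        "window": (timedelta(hours=-3), timedelta(0)),
    },
    "LA": {
        "cutoff": timedelta(minutes=30),
        "window": None,  # no assimilation window is listed for 3D-Var
    },
}


def earliest_start(system, analysis_time):
    """Earliest time the analysis can start: analysis time plus cut-off."""
    return analysis_time + DA_SYSTEMS[system]["cutoff"]


def assimilation_window(system, analysis_time):
    """Return (start, end) of the assimilation window, or None if not listed."""
    window = DA_SYSTEMS[system]["window"]
    if window is None:
        return None
    return analysis_time + window[0], analysis_time + window[1]
```

For a 00 UTC analysis time, `earliest_start("GA_early", t)` returns 02:20 UTC on the same day, and `assimilation_window("MA", t)` spans 21 UTC of the previous day to 00 UTC.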


Daily schedule of operational suite: Daytime (00-12 UTC)
[Figure: number of nodes in use against time, 00 to 12 UTC (9am to 9pm LST), for Global Analysis (Early), Global Analysis (Cycle), Global Forecast, Meso Analysis, Meso Forecast, One-Week EPS, Typhoon EPS and Local Analysis / Forecast.]
Local Analysis / Forecast: 24 times!!
# Job group kinds: 87
# Job groups: 524
# Jobs: 14,737
# Steps: 183,353


Daily schedule of operational suite: Nighttime (12-00 UTC)
[Figure: number of nodes in use against time, 12 to 00 UTC (9pm to 9am LST).]
# Job group kinds: 87
# Job groups: 524
# Jobs: 14,737
# Steps: 183,353


Data Dependency between JGs
[Figure: timeline from 00 UTC to 00 UTC of the next day, in three rows (Global, Mesoscale, Local), showing data dependencies between job groups: Da00, Da06, Da12, Da18; Ea and Ef at 00, 06, 12, 18; Ma and Mf at 00, 03, 06, 09, 12, 15, 18, 21; La and Lf at every hour from 00 to 23.]
Da: Global Cycle Analysis
Ea: Global Early Analysis
Ef: Global Forecast
Ma: Mesoscale Analysis
Mf: Mesoscale Forecast
La: Local Analysis
Lf: Local Forecast


Management system of operational jobs
• Registration form (Microsoft Excel sheets with VBA macros): Information about job groups, jobs, datasets, executables, etc. Submitted when a new job is added, or existing datasets or executables are updated.
• Job control language (JCL): Converted into a shell script.
• Program build file-format (PBF): Converted into a makefile to compile executables.
• Job definition file (JDF): Information about a job group and jobs, including the schedule (time to run), the order (preceding job groups and jobs), computational resources required, etc.
• Utility programs to register information and check the consistency, etc.
• DBMS for registration
• DBMS for job management
• Job scheduler
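The "order (preceding job groups and jobs)" held in a job definition is the kind of information from which a run order can be derived, and a consistency check can catch references to unregistered job groups. The sketch below illustrates the idea with Python's standard `graphlib`. The record layout, the node counts and the dependencies shown are all hypothetical; they are not taken from JMA's JDF format or from the actual operational suite.

```python
from graphlib import TopologicalSorter

# Hypothetical JDF-like records. The field names, node counts and
# dependencies are assumptions made for illustration only.
JOB_GROUPS = {
    "Ea00": {"nodes": 16, "after": []},
    "Ef00": {"nodes": 64, "after": ["Ea00"]},
    "Ma03": {"nodes": 32, "after": ["Ef00"]},
    "Mf03": {"nodes": 32, "after": ["Ma03"]},
}


def check_consistency(job_groups):
    """Raise ValueError if a job group names a predecessor that is not registered."""
    for name, record in job_groups.items():
        for predecessor in record["after"]:
            if predecessor not in job_groups:
                raise ValueError(
                    f"{name} depends on unknown job group {predecessor}"
                )


def run_order(job_groups):
    """Return one order in which every job group follows its predecessors."""
    check_consistency(job_groups)
    graph = {name: set(record["after"]) for name, record in job_groups.items()}
    return list(TopologicalSorter(graph).static_order())
```

`run_order(JOB_GROUPS)` returns the chain Ea00, Ef00, Ma03, Mf03 for this toy input. A circular dependency would make `static_order()` raise `graphlib.CycleError`.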


UPDATES OF OPERATIONAL MODELS


Major Updates of operational Models
[Figure: timeline of updates by month from June 2012 to December 2014. Rows: models and DAs for short- and medium-range forecast (GSM/GA, MSM/MA, LFM/LA, WEPS, TEPS); data usage; climate, ocean and environment.]
Updates labeled in the figure:
• Cloud scheme
• Radiation scheme
• Upgrade TL959L60 -> TL959L100, some physical processes
• Forecast period extension (11 days) (shown twice)
• Start hourly operation, domain expansion
• Upgrade TL319 -> TL479, operate twice a day
• Upgrade TL319 -> TL479, ensemble size 11 -> 25
• GNSS RO
• RTTOV v10.2, MHS over land
• AVHRR/AMV, LEOGEO/AMV
• GCOM-W
• Domain expansion
• Asia area Storm Surge Model
• Ocean Wave EPS
• Ocean Wave DA
• JRA-55 extension
• Update CTM for ozone
• Upgrade of One-month EPS, TL159 -> TL319
• Domain expansion
• AIRS, IASI
• Ground-based GNSS-ZTD etc.
• ASUCA
• Upgrade CTM for aerosol
• Metop-B
• Forecast period extension


JMA’s 2-km operational model - Local NWP system -

• The Local NWP system provides 9-hour period forecasts every hour.
• In the system design, frequent updates of forecasts (24 times a day!!) assimilating the latest observation are highly emphasized.
• The Local NWP system consists of two subsystems:
  – NWP model: The Local Forecast Model (LFM) has a 2-km horizontal grid spacing and 60 vertical layers.
  – Data assimilation system: The Local Analysis (LA) employs an analysis system based on three-dimensional variational data assimilation (3D-Var) at a 5-km resolution.


The major upgrade of GSM in March 2014

• The number of vertical levels has been increased from 60 to 100.
  – Giving weight to the troposphere and high weight to the tropopause.
• The model top level has been raised from 0.1 hPa to 0.01 hPa.
• Shorter time step: 600s -> 400s.
• Various physical processes have also been upgraded.

[Figure: vertical level distribution of the current GSM (L100) and the previous GSM (L60).]
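A quick way to see what the changes in levels and time step imply is to count time steps and form a crude cost ratio. The sketch below assumes that cost scales linearly with the number of levels and the number of time steps. That is a simplification introduced here, not a figure from the presentation, and it ignores the upgraded physical processes.

```python
def time_steps(forecast_hours, dt_seconds):
    """Number of time steps needed for a forecast of the given length."""
    return forecast_hours * 3600 // dt_seconds


def relative_cost(levels_new, levels_old, dt_new, dt_old):
    """Cost ratio, assuming cost scales with (levels) x (number of steps)."""
    return (levels_new / levels_old) * (dt_old / dt_new)
```

For the 264-hour forecast, the step count goes from `time_steps(264, 600)` = 1584 to `time_steps(264, 400)` = 2376, and `relative_cost(100, 60, 400, 600)` gives a factor of about 2.5 under the linear-scaling assumption.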


RESEARCH FOR FUTURE NWP SYSTEMS


Research for future NWP systems
• Development of new non-hydrostatic model “ASUCA”
• Research for future global NWP model


Development of “ASUCA”

• The Japan Meteorological Agency (JMA) is operating a non-hydrostatic regional model (JMA-NHM).
• JMA-NHM has been developed since the 1980s.
  – It is well tested and checked but ...
    • The dynamical core of JMA-NHM has remained almost unchanged while a lot of physical processes have been developed.
    • It has been extended over many years … the model code is not simple.
• The recent rapid increase in market share of scalar multi-core architecture machines is noticeable.

… these have motivated us to renovate the model.


Comparison of the specification of the dynamical core between ASUCA and JMA-NHM

Governing equations
  – ASUCA: Flux form, fully compressible equations
  – JMA-NHM: Quasi flux form, fully compressible equations
Prognostic variables
  – ASUCA: ρu, ρv, ρw, ρθm, ρ
  – JMA-NHM: ρu, ρv, ρw, θ, p
Spatial discretization
  – ASUCA: Finite volume method
  – JMA-NHM: Finite difference method
Time integration
  – ASUCA: 3rd-order Runge-Kutta (long and short)
  – JMA-NHM: Leapfrog with time filter (long), forward backward (short)
Treatment of sound
  – ASUCA: Conservative split explicit
  – JMA-NHM: Split explicit
Advection
  – ASUCA: Combining 3rd and 1st order upwind with flux limiter by Koren (1993)
  – JMA-NHM: 4th (hor.) and 2nd (ver.) order with advection correction
Numerical diffusion
  – ASUCA: None
  – JMA-NHM: 4th order linear and nonlinear diffusion
Treatment of rain-drop
  – ASUCA: Time-split
  – JMA-NHM: Box-Lagrangian
Coordinate
  – ASUCA: Generalized coordinate or conformal mapping + Hybrid-Z
  – JMA-NHM: Conformal mapping (hor.), Hybrid-Z (ver.)
Grid
  – ASUCA: Arakawa-C (hor.), Lorenz (ver.)
  – JMA-NHM: Arakawa-C (hor.), Lorenz (ver.)

Callouts in the figure: accurate mass conservation; higher accuracy; computational efficiency; computational stability.
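The advection entry for ASUCA, a combination of 3rd and 1st order upwind values controlled by the Koren (1993) flux limiter, can be illustrated in one dimension. The sketch below uses the limiter in the form max(0, min(2r, (1 + 2r)/3, 2)), with r the ratio of the downwind to the upwind difference. The function names and the three-cell interface are assumptions made here, and this is not ASUCA's code.

```python
def koren_limiter(r):
    """Koren (1993) limiter: max(0, min(2r, (1 + 2r) / 3, 2))."""
    return max(0.0, min(2.0 * r, (1.0 + 2.0 * r) / 3.0, 2.0))


def face_value(q_up2, q_up1, q_down):
    """Limited value at the face between q_up1 (upwind cell) and q_down.

    q_up2 is the cell one further upwind. Where the limiter is 0 the
    result is the 1st order upwind value q_up1; where the limiter equals
    (1 + 2r) / 3 it is the 3rd order upwind-biased value.
    """
    upwind_diff = q_up1 - q_up2
    if upwind_diff == 0.0:
        # r is undefined here, but the limited correction vanishes anyway.
        return q_up1
    r = (q_down - q_up1) / upwind_diff
    return q_up1 + 0.5 * koren_limiter(r) * upwind_diff
```

For smooth data such as `face_value(1.0, 2.0, 2.5)` the result matches the 3rd order upwind-biased formula (-q_up2 + 5 q_up1 + 2 q_down) / 6, while at a local extremum such as `face_value(1.0, 2.0, 1.0)` the limiter falls back to the 1st order upwind value.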


Software design of ASUCA
• To achieve higher efficiency on massively parallel scalar multi-core architecture
  – kij-ordering
    • Three-dimensional arrays in space are stored sequentially in the order of z (k), x (i) and y (j):
      real(8) :: u(nz, nx, ny)
    • Aiming at low memory usage to improve cache efficiency
    • Advantageous to parallelize at the outermost loop.
  – Reduce the number of MPI communications
    • The number of MPI communications needed for the flux limiter scheme is less than for the scheme used in JMA-NHM (4th order + adv. correction).
    • A subroutine for stocking data before MPI comm. and a subroutine for MPI comm. of the stocked data are separately prepared.
    • Procedures for diagnosing variables are collected before the procedures of dynamics and physics, so as not to add unnecessary duplicated diagnosis and its sequential MPI comm.
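The effect of kij-ordering on memory layout can be seen by computing element offsets for the declaration `u(nz, nx, ny)`. Fortran stores arrays column-major, so the first index (k) varies fastest. The sketch below is plain Python with hypothetical helper names, meant only to show that a vertical column is contiguous and that each j index selects a separate slab of nz * nx elements.

```python
def kij_offset(k, i, j, nz, nx):
    """0-based element offset of u(k, i, j) for Fortran `u(nz, nx, ny)`.

    Fortran is column-major, so the first index (k) varies fastest.
    Indices k, i, j are 1-based, as in Fortran.
    """
    return (k - 1) + nz * ((i - 1) + nx * (j - 1))


def column_offsets(i, j, nz, nx):
    """Offsets of every level k in the vertical column at (i, j)."""
    return [kij_offset(k, i, j, nz, nx) for k in range(1, nz + 1)]
```

With nz = 3 and nx = 4, `column_offsets(2, 1, 3, 4)` gives `[3, 4, 5]`, a contiguous run, and moving from j = 1 to j = 2 jumps by nz * nx = 12 elements, which is why the outermost j loop is a natural place to split work.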

Number of calling MPI comm.
  – ASUCA: 72600
  – JMA-NHM: 138652
Assumption: 1-hour forecast of LFM (dx=dy=2km); JMA-NHM: dt=8, ASUCA: dt=16
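The two totals can be put on a per-time-step basis with simple arithmetic. The sketch below assumes that dt is given in seconds (the slide states no unit) and that the counts cover exactly one hour of model time. It is only arithmetic on the quoted totals and says nothing about message sizes or where in a step the calls occur.

```python
FORECAST_SECONDS = 3600  # 1-hour forecast, as stated on the slide

# dt is assumed to be in seconds; the slide gives no unit.
CASES = {
    "ASUCA": {"dt": 16, "mpi_calls": 72600},
    "JMA-NHM": {"dt": 8, "mpi_calls": 138652},
}


def time_steps(dt):
    """Number of time steps in the 1-hour forecast."""
    return FORECAST_SECONDS // dt


def calls_per_step(name):
    """Average number of MPI communication calls per time step."""
    case = CASES[name]
    return case["mpi_calls"] / time_steps(case["dt"])
```

Under these assumptions ASUCA takes 225 steps and JMA-NHM 450 steps for the hour, so the longer time step accounts for a large part of the difference between the two totals.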


Schematic figure of single & multi I/O ranks
[Figure: timelines of I/O ranks and compute ranks, including the single I/O rank case. Legend: disk read; scattering from I/O ranks to compute ranks; gathering from compute ranks to I/O ranks; disk write; computation; waiting.]


Research for future global NWP model
• Current
  – GSM (TL959L100 (~20km))
    • Hydrostatic, semi-implicit semi-Lagrangian spherical harmonics spectral model.
• Plan for Next Generation HPC
  – GSM (specification: TBD)
    • Plan to enhance the resolution, etc.
• Options for further future (Non-Hydrostatic).
  – Spectral Model?
    • Non-Hydrostatic expansion of the current GSM??
    • Using “Double Fourier” series??
  – Grid Model?
    • ASUCA-GLOBAL??
    • Others???


THANKS FOR YOUR ATTENTION
